Preface


This book is written for those students of economics intent on learning the basic mathe¬ 
matical methods that have become indispensable for a proper understanding of the current 
economic literature. Unfortunately, studying mathematics is, for many, something akin to 
taking bitter-tasting medicine: absolutely necessary, but extremely unpleasant. Such an
attitude, referred to as “math anxiety,” has its roots, we believe, largely in the inauspicious
manner in which mathematics is often presented to students. In the belief that conciseness 
means elegance, explanations offered are frequently too brief for clarity, thus puzzling stu¬ 
dents and giving them an undeserved sense of intellectual inadequacy. An overly formal 
style of presentation, when not accompanied by any intuitive illustrations or demonstra¬ 
tions of “relevance,” can impair motivation. An uneven progression in the level of material 
can make certain mathematical topics appear more difficult than they actually are. Finally, 
exercise problems that are excessively sophisticated may tend to shatter students’ confi¬ 
dence, rather than stimulate thinking as intended. 

With that in mind, we have made a serious effort to minimize anxiety-causing features.
To the extent possible, patient rather than cryptic explanations are offered. The style is
deliberately informal and “reader-friendly.” As a matter of routine, we try to anticipate and
answer questions that are likely to arise in the students’ minds as they read. To underscore 
the relevance of mathematics to economics, we let the analytical needs of economists mo¬ 
tivate the study of the related mathematical techniques and then illustrate the latter with ap¬ 
propriate economic models immediately afterward. Also, the mathematical tool kit is built 
up on a carefully graduated schedule, with the elementary tools serving as stepping stones 
to the more advanced tools discussed later. Wherever appropriate, graphic illustrations give 
visual reinforcement to the algebraic results. And we have designed the exercise problems 
as drills to help solidify grasp and bolster confidence, rather than exact challenges that 
might unwittingly frustrate and intimidate the novice. 

In this book, the following major types of economic analysis are covered: statics
(equilibrium analysis), comparative statics, optimization problems (as a special type of statics),
dynamics, and dynamic optimization. To tackle these, the following mathematical methods 
are introduced in due course: matrix algebra, differential and integral calculus, differential 
equations, difference equations, and optimal control theory. Because of the substantial 
number of illustrative economic models, both macro and micro, appearing here, this
book should be useful also to those who are already mathematically trained but still in need
of a guide to usher them from the realm of mathematics to the land of economics. For the 
same reason, the book should serve not only as a text for a course on mathematical
methods, but also as supplementary reading in such courses as microeconomic theory,
macroeconomic theory, and economic growth and development.

We have attempted to retain the principal objectives and style of the previous editions. 
However, the present edition contains several significant changes. The material on
mathematical programming is now presented earlier in a new Chap. 13 entitled “Further Topics
in Optimization.” This chapter has two major themes: optimization with inequality
constraints and the envelope theorem. Under the first theme, the Kuhn-Tucker conditions are
developed in much the same manner as in the previous edition. However, the topic has been
enhanced with several new economic applications, including peak-load pricing and con¬ 
sumer rationing. The second theme is related to the development of the envelope theorem, 
the maximum-value function, and the notion of duality. By applying the envelope theorem 
to various economic models, we derive important results such as Roy’s identity, Shephard’s 
lemma, and Hotelling’s lemma.

The second major addition to this edition is a new Chap. 20 on optimal control theory. 
The purpose of this chapter is to introduce the reader to the basics of optimal control and 
demonstrate how it may be applied in economics, including examples from natural
resource economics and optimal growth theory. The material in this chapter is drawn in great
part from the discussion of optimal control theory in Elements of Dynamic Optimization by 
Alpha C. Chiang (McGraw-Hill 1992, now published by Waveland Press, Inc.), which pre¬ 
sents a thorough treatment of both optimal control and its precursor, calculus of variations. 

Aside from the two new chapters, there are several significant additions and refinements 
to this edition. In Chap. 3 we have expanded the discussion of solving higher-order poly¬ 
nomial equations by factoring (Sec. 3.3). In Chap. 4, a new section on Markov chains has 
been added (Sec. 4.7). And, in Chap. 5, we have introduced the checking of the rank of a
matrix via an echelon matrix (Sec. 5.1), and the Hawkins-Simon condition in connection 
with the Leontief input-output model (Sec. 5.7). With respect to economic applications,
many new examples have been added and some of the existing applications have been
enhanced. A linear version of the IS-LM model has been included in Sec. 5.6, and a more
general form of the model in Sec. 8.6 has been expanded to encompass both a closed and open
economy, thereby demonstrating a much richer application of comparative statics to 
general-function models. Other additions include a discussion of expected utility and risk 
preferences (Sec. 9.3), a profit-maximization model that incorporates the Cobb-Douglas 
production function (Sec. 11.6), and a two-period intertemporal choice problem
(Sec. 12.3). Finally, the exercise problems have been revised and augmented, giving
students a greater opportunity to hone their skills.
Chapter


The Nature of 
Mathematical Economics 


Mathematical economics is not a distinct branch of economics in the sense that public
finance or international trade is. Rather, it is an approach to economic analysis, in which the
economist makes use of mathematical symbols in the statement of the problem and also 
draws upon known mathematical theorems to aid in reasoning. As far as the specific sub¬ 
ject matter of analysis goes, it can be micro- or macroeconomic theory, public finance, 
urban economics, or what not. 

Using the term mathematical economics in the broadest possible sense, one may very 
well say that every elementary textbook of economics today exemplifies mathematical
economics insofar as geometrical methods are frequently utilized to derive theoretical results.
More commonly, however, mathematical economics is reserved to describe cases employ¬ 
ing mathematical techniques beyond simple geometry, such as matrix algebra, differential 
and integral calculus, differential equations, difference equations, etc. It is the purpose of 
this book to introduce the reader to the most fundamental aspects of these mathematical 
methods, those encountered daily in the current economic literature.


1.1 Mathematical versus Nonmathematical Economics

Since mathematical economics is merely an approach to economic analysis, it should not 
and does not fundamentally differ from the nonmathematical approach to economic
analysis. The purpose of any theoretical analysis, regardless of the approach, is always to derive
a set of conclusions or theorems from a given set of assumptions or postulates via a process 
of reasoning. The major difference between “mathematical economics” and “literary eco¬ 
nomics” is twofold: First, in the former, the assumptions and conclusions are stated in 
mathematical symbols rather than words and in equations rather than sentences. Second, 
in place of literary logic, use is made of mathematical theorems—of which there exists an 
abundance to draw upon—in the reasoning process. Inasmuch as symbols and words are 
really equivalents (witness the fact that symbols are usually defined in words), it matters lit¬ 
tle which is chosen over the other. But it is perhaps beyond dispute that symbols are more 
convenient to use in deductive reasoning, and certainly are more conducive to conciseness
and preciseness of statement.

The choice between literary logic and mathematical logic, again, is a matter of little
import, but mathematics has the advantage of forcing analysts to make their assumptions
explicit at every stage of reasoning. This is because mathematical theorems are usually stated
in the “if-then” form, so that in order to tap the “then” (result) part of the theorem for their
use, they must first make sure that the “if” (condition) part does conform to the explicit
assumptions adopted.

Granting these points, though, one may still ask why it is necessary to go beyond geo¬ 
metric methods. The answer is that while geometric analysis has the important advantage 
of being visual, it also suffers from a serious dimensional limitation. In the usual graphical 
discussion of indifference curves, for instance, the standard assumption is that only two 
commodities are available to the consumer. Such a simplifying assumption is not willingly 
adopted but is forced upon us because the task of drawing a three-dimensional graph is ex¬ 
ceedingly difficult, and the construction of a four- (or higher) dimensional graph is actually 
a physical impossibility. To deal with the more general case of 3, 4, or n goods, we must
instead resort to the more flexible tool of equations. This reason alone should provide suf¬ 
ficient motivation for the study of mathematical methods beyond geometry. 

In short, we see that the mathematical approach has claim to the following advantages:
(1) The “language” used is more concise and precise; (2) there exists a wealth of mathe¬ 
matical theorems at our service; (3) in forcing us to state explicitly all our assumptions as 
a prerequisite to the use of the mathematical theorems, it keeps us from the pitfall of an un¬ 
intentional adoption of unwanted implicit assumptions; and (4) it allows us to treat the 
general n-variable case. 

Against these advantages, one sometimes hears the criticism that a mathematically
derived theory is inevitably unrealistic. However, this criticism is not valid. In fact, the epithet
“unrealistic” cannot even be used in criticizing economic theory in general, whether or not
the approach is mathematical. Theory is by its very nature an abstraction from the real 
world. It is a device for singling out only the most essential factors and relationships so that 
we can study the crux of the problem at hand, free from the many complications that do 
exist in the actual world. Thus the statement “theory lacks realism” is merely a truism that
cannot be accepted as a valid criticism of theory. By the same token, it is quite meaningless
to pick out any one approach to theory as “unrealistic.” For example, the theory of firm
under pure competition is unrealistic, as is the theory of firm under imperfect competition,
but whether these theories are derived mathematically or not is irrelevant and immaterial. 

To take advantage of the wealth of mathematical tools, one must of course first acquire 
those tools. Unfortunately, the tools that are of interest to economists are widely scattered 
among many mathematics courses, too many to fit comfortably into the plan of study of a
typical economics student. The service the present volume performs is to gather in one 
place the mathematical methods most relevant to the economics literature, organize them 
into a logical order of progression, fully explain each method, and then immediately illus¬ 
trate how the method is applied in economic analysis. By tying together the methods 
and their applications, the relevance of mathematics to economics is made more transpar¬ 
ent than is possible in the regular mathematics courses where the illustrated applications 
are predominantly tied to physics and engineering. Familiarity with the contents of this 
book (and, if possible, also its sequel volume: Alpha C. Chiang, Elements of Dynamic 
Optimization, McGraw-Hill, 1992, now published by Waveland Press, Inc.) should there¬ 
fore enable you to comprehend most of the professional articles you will come across in 
such periodicals as the American Economic Review, Quarterly Journal of Economics,
Journal of Political Economy, Review of Economics and Statistics, and Economic Journal.
Those of you who, through this exposure, develop a serious interest in mathematical 
economics can then proceed to a more rigorous and advanced study of mathematics. 


1.2 Mathematical Economics versus Econometrics 


The term mathematical economics is sometimes confused with a related term, economet¬ 
rics. As the “metric” part of the latter term implies, econometrics is concerned mainly with 
the measurement of economic data. Hence it deals with the study of empirical observations 
using statistical methods of estimation and hypothesis testing. Mathematical economics, on 
the other hand, refers to the application of mathematics to the purely theoretical aspects of 
economic analysis, with little or no concern about such statistical problems as the errors of 
measurement of the variables under study. 

In the present volume, we shall confine ourselves to mathematical economics. That is, 
we shall concentrate on the application of mathematics to deductive reasoning rather than 
inductive study, and as a result we shall be dealing primarily with theoretical rather than 
empirical material. This is, of course, solely a matter of choice of the scope of discussion, 
and it is by no means implied that econometrics is less important. 

Indeed, empirical studies and theoretical analyses are often complementary and mutu¬ 
ally reinforcing. On the one hand, theories must be tested against empirical data for valid¬ 
ity before they can be applied with confidence. On the other, statistical work needs 
economic theory as a guide, in order to determine the most relevant and fruitful direction 
of research. 

In one sense, however, mathematical economics may be considered as the more basic of 
the two: for, to have a meaningful statistical and econometric study, a good theoretical 
framework—preferably in a mathematical formulation—is indispensable. Hence the 
subject matter of the present volume should be useful not only for those interested in theo¬ 
retical economics, but also for those seeking a foundation for the pursuit of econometric 
studies. 
As mentioned before, any economic theory is necessarily an abstraction from the real
world. For one thing, the immense complexity of the real economy makes it impossible for 
us to understand all the interrelationships at once; nor, for that matter, are all these
interrelationships of equal importance for the understanding of the particular economic
phenomenon under study. The sensible procedure is, therefore, to pick out what appeals to our
reason to be the primary factors and relationships relevant to our problem and to focus our 
attention on these alone. Such a deliberately simplified analytical framework is called an 
economic model, since it is only a skeletal and rough representation of the actual economy. 

2.1 Ingredients of a Mathematical Model

An economic model is merely a theoretical framework, and there is no inherent reason why 
it must be mathematical. If the model is mathematical, however, it will usually consist of a
set of equations designed to describe the structure of the model. By relating a number of
variables to one another in certain ways, these equations give mathematical form to the set
of analytical assumptions adopted. Then, through application of the relevant mathematical
operations to these equations, we may seek to derive a set of conclusions which logically
follow from those assumptions.

Variables, Constants, and Parameters 

A variable is something whose magnitude can change, i.e., something that can take on
different values. Variables frequently used in economics include price, profit, revenue, cost,
national income, consumption, investment, imports, and exports. Since each variable can
assume various values, it must be represented by a symbol instead of a specific number. For
example, we may represent price by P, profit by π, revenue by R, cost by C, national
income by Y, and so forth. When we write P = 3 or C = 18, however, we are “freezing”
these variables at specific values (in appropriately chosen units).

Properly constructed, an economic model can be solved to give us the solution values 
of a certain set of variables, such as the market-clearing level of price, or the
profit-maximizing level of output. Such variables, whose solution values we seek from the model,
are known as endogenous variables (originating from within). However, the model may
also contain variables which are assumed to be determined by forces external to the model
and whose magnitudes are accepted as given data only; such variables are called exogenous
variables (originating from without). It should be noted that a variable that is endogenous
to one model may very well be exogenous to another. In an analysis of the market determi¬ 
nation of wheat price (P), for instance, the variable P should definitely be endogenous; but 
in the framework of a theory of consumer expenditure, P would become instead a datum to 
the individual consumer, and must therefore be considered exogenous. 

Variables frequently appear in combination with fixed numbers or constants, such as
in the expressions 7P or 0.5R. A constant is a magnitude that does not change and is
therefore the antithesis of a variable. When a constant is joined to a variable, it is often referred
to as the coefficient of that variable. However, a coefficient may be symbolic rather than
numerical. We can, for instance, let the symbol a stand for a given constant and use the
expression aP in lieu of 7P in a model, in order to attain a higher level of generality
(see Sec. 2.7). This symbol a is a rather peculiar case: it is supposed to represent a given
constant, and yet, since we have not assigned to it a specific number, it can take virtually
any value. In short, it is a constant that is variable! To identify its special status, we give it
the distinctive name parametric constant (or simply parameter).

It must be duly emphasized that, although different values can be assigned to a
parameter, it is nevertheless to be regarded as a datum in the model. It is for this reason that
people sometimes simply say “constant” even when the constant is parametric. In this respect,
parameters closely resemble exogenous variables, for both are to be treated as “givens” in
a model. This explains why many writers, for simplicity, refer to both collectively with the 
single designation “parameters.” 

As a matter of convention, parametric constants are normally represented by the
symbols a, b, c, or their counterparts in the Greek alphabet: α, β, and γ. But other symbols
naturally are also permissible. As for exogenous variables, in order that they can be visually
distinguished from their endogenous cousins, we shall follow the practice of attaching a
subscript 0 to the chosen symbol. For example, if P symbolizes price, then P_0 signifies an
exogenously determined price.

Equations and Identities

Variables may exist independently, but they do not really become interesting until they are 
related to one another by equations or by inequalities. At this moment we shall discuss 
equations only. 

In economic applications we may distinguish between three types of equation:
definitional equations, behavioral equations, and conditional equations.

A definitional equation sets up an identity between two alternate expressions that have 
exactly the same meaning. For such an equation, the identical-equality sign ≡ (read: “is
identically equal to”) is often employed in place of the regular equals sign =, although the
latter is also acceptable. As an example, total profit is defined as the excess of total revenue 
over total cost; we can therefore write 

π ≡ R - C

A behavioral equation, on the other hand, specifies the manner in which a variable
behaves in response to changes in other variables. This may involve either human behavior
(such as the aggregate consumption pattern in relation to national income) or nonhuman
behavior (such as how total cost of a firm reacts to output changes). Broadly defined,
behavioral equations can be used to describe the general institutional setting of a model,
including the technological (e.g., production function) and legal (e.g., tax structure) aspects.
Before a behavioral equation can be written, however, it is always necessary to adopt
definite assumptions regarding the behavior pattern of the variable in question. Consider the
two cost functions 

C = 75 + 10Q (2.1)

C = 110 + Q^2 (2.2)

where Q denotes the quantity of output. Since the two equations have different forms, the 
production condition assumed in each is obviously different from the other. In (2.1), the 
fixed cost (the value of C when Q = 0) is 75, whereas in (2.2) it is 110. The variation in cost 
is also different. In (2.1), for each unit increase in Q, there is a constant increase of 10 in C.
But in (2.2), as Q increases unit after unit, C will increase by progressively larger amounts.
Clearly, it is primarily through the specification of the form of the behavioral equations that 
we give mathematical expression to the assumptions adopted for a model. 
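
The contrast between (2.1) and (2.2) can be checked numerically. The following Python sketch is an illustration only; the function names cost_linear, cost_quadratic, and increments are hypothetical labels introduced here and are not part of the text's notation.

```python
def cost_linear(q):
    """Total cost under (2.1): C = 75 + 10Q."""
    return 75 + 10 * q


def cost_quadratic(q):
    """Total cost under (2.2): C = 110 + Q^2."""
    return 110 + q ** 2


def increments(cost, max_q):
    """Increase in C caused by each successive unit of output, from Q = 0 up to Q = max_q."""
    return [cost(q + 1) - cost(q) for q in range(max_q)]


# Fixed cost is the value of C when Q = 0.
fixed_linear = cost_linear(0)
fixed_quadratic = cost_quadratic(0)

# Cost increase for the first four units of output under each assumption.
steps_linear = increments(cost_linear, 4)
steps_quadratic = increments(cost_quadratic, 4)

print(fixed_linear, steps_linear)
print(fixed_quadratic, steps_quadratic)
```

Under (2.1) every increment equals 10, whereas under (2.2) the increments grow with each additional unit, which is the difference in assumed production conditions described above.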

As the third type, a conditional equation states a requirement to be satisfied. For
example, in a model involving the notion of equilibrium, we must set up an equilibrium
condition, which describes the prerequisite for the attainment of equilibrium. Two of the most
familiar equilibrium conditions in economics are 

Q_d = Q_s [quantity demanded = quantity supplied]

and S = I [intended saving = intended investment] 

which pertain, respectively, to the equilibrium of a market model and the equilibrium of the 
national-income model in its simplest form. Similarly, an optimization model either derives 
or applies one or more optimization conditions. One such condition that comes easily to 
mind is the condition 

MC = MR [marginal cost = marginal revenue]

in the theory of the firm. Because equations of this type are neither definitional nor behav¬ 
ioral, they constitute a class by themselves. 

2.2 The Real-Number System

Equations and variables are the essential ingredients of a mathematical model. But since 
the values that an economic variable takes are usually numerical, a few words should be 
said about the number system. Here, we shall deal only with so-called real numbers. 

Whole numbers such as 1, 2, 3, ... are called positive integers; these are the numbers
most frequently used in counting. Their negative counterparts -1, -2, -3, ... are called
negative integers; these can be employed, for example, to indicate subzero temperatures (in
degrees). The number 0 (zero), on the other hand, is neither positive nor negative, and is in 
that sense unique. Let us lump all the positive and negative integers and the number zero 
into a single category, referring to them collectively as the set of all integers. 

Integers, of course, do not exhaust all the possible numbers, for we have fractions, such
as 2/3, 5/4, and 7/3, which (if placed on a ruler) would fall between the integers. Also, we have
negative fractions, such as -1/2 and -2/5. Together, these make up the set of all fractions.
The common property of all fractional numbers is that each is expressible as a ratio of
two integers. Any number that can be expressed as a ratio of two integers is called a
rational number. But integers themselves are also rational, because any integer n can be
considered as the ratio n/1. The set of all integers and the set of all fractions together form the set
of all rational numbers. An alternative defining characteristic of a rational number is that it
is expressible as either a terminating decimal (e.g., 1/4 = 0.25) or a repeating decimal (e.g.,
1/3 = 0.3333...), where some number or series of numbers to the right of the decimal point
is repeated indefinitely.
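
The two decimal forms of a rational number can be explored with exact ratio arithmetic. The sketch below is an illustration, not part of the original exposition. It uses Python's standard fractions module, and the helper has_terminating_decimal is a hypothetical name. The helper relies on a standard number-theory fact that the text does not state: a ratio in lowest terms has a terminating decimal exactly when its denominator has no prime factors other than 2 and 5.

```python
from fractions import Fraction


def has_terminating_decimal(x):
    """Return True if the rational number x has a terminating decimal expansion.

    Assumption (standard fact, not from the text): in lowest terms, the
    denominator must contain no prime factors other than 2 and 5.
    """
    d = x.denominator  # Fraction always stores the ratio in lowest terms
    for p in (2, 5):
        while d % p == 0:
            d //= p
    return d == 1


quarter = Fraction(1, 4)   # 1/4 = 0.25, a terminating decimal
third = Fraction(1, 3)     # 1/3 = 0.3333..., a repeating decimal
seven = Fraction(7)        # the integer 7 viewed as the ratio 7/1

print(has_terminating_decimal(quarter))
print(has_terminating_decimal(third))
print(seven.numerator, seven.denominator)
```

An integer such as 7 is stored with denominator 1, which mirrors the remark that any integer n can be considered as the ratio n/1.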

Once the notion of rational numbers is used, there naturally arises the concept of
irrational numbers: numbers that cannot be expressed as ratios of a pair of integers. One
example is the number √2 = 1.4142..., which is a nonrepeating, nonterminating decimal.
Another is the special constant π = 3.1415... (representing the ratio of the circumference
of any circle to its diameter), which is again a nonrepeating, nonterminating decimal, as is
characteristic of all irrational numbers.

Each irrational number, if placed on a ruler, would fall between two rational numbers,
so that, just as the fractions fill in the gaps between the integers on a ruler, the irrational 
numbers fill in the gaps between rational numbers. The result of this filling-in process is a 
continuum of numbers, all of which are so-called real numbers. This continuum constitutes 
the set of all real numbers, which is often denoted by the symbol R. When the set R is
displayed on a straight line (an extended ruler), we refer to the line as the real line.

In Fig. 2.1 are listed (in the order discussed) all the number sets, arranged in relationship 
to one another. If we read from bottom to top, however, we find in effect a classificatory 
scheme in which the set of real numbers is broken down into its component and subcom¬ 
ponent number sets. This figure therefore is a summary of the structure of the real-number 
system. 

Real numbers are all we need for the first 15 chapters of this book, but they are not the 
only numbers used in mathematics. In fact, the reason for the term real is that there are also
“imaginary” numbers, which have to do with the square roots of negative numbers. That
concept will be discussed later, in Chap. 16. 

2.3 The Concept of Sets

We have already employed the word set several times. Inasmuch as the concept of sets
underlies every branch of modern mathematics, it is desirable to familiarize ourselves at
least with its more basic aspects. 
Set Notation

A set is simply a collection of distinct objects. These objects may be a group of (distinct) 
numbers, persons, food items, or something else. Thus, all the students enrolled in a
particular economics course can be considered a set, just as the three integers 2, 3, and 4 can
form a set. The objects in a set are called the elements of the set. 

There are two alternative ways of writing a set: by enumeration and by description. If 
we let S represent the set of three numbers 2, 3, and 4, we can write, by enumeration of the
elements, 

S = {2, 3, 4}

But if we let I denote the set of all positive integers, enumeration becomes difficult, and we
may instead simply describe the elements and write

I = {x | x a positive integer}

which is read as follows: “I is the set of all (numbers) x, such that x is a positive integer.”
Note that a pair of braces is used to enclose the set in either case. In the descriptive 
approach, a vertical bar (or a colon) is always inserted to separate the general designating 
symbol for the elements from the description of the elements. As another example, the 
set of all real numbers greater than 2 but less than 5 (call it J) can be expressed symboli¬ 
cally as 

J = {x | 2 < x < 5}

Here, even the descriptive statement is symbolically expressed. 

A set with a finite number of elements, exemplified by the previously given set S, is
called a finite set. Set I and set J, each with an infinite number of elements, are, on the other
hand, examples of an infinite set. Finite sets are always denumerable (or countable), i.e.,
their elements can be counted one by one in the sequence 1, 2, 3, .... Infinite sets may,
however, be either denumerable (set I), or nondenumerable (set J). In the latter case, there
is no way to associate the elements of the set with the natural counting numbers 1, 2, 3, ...,
and thus the set is not countable.

Membership in a set is indicated by the symbol ∈ (a variant of the Greek letter epsilon ε
for “element”), which is read as follows: “is an element of.” Thus, for the two sets S and I
defined previously, we may write

2 ∈ S    3 ∈ S    8 ∈ I    9 ∈ I    (etc.)

but obviously 8 ∉ S (read: “8 is not an element of set S”). If we use the symbol R to denote
the set of all real numbers, then the statement “x is some real number” can be simply
expressed by


x ∈ R
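
Enumeration, description, and membership all have close counterparts in a programming language. The Python sketch below is illustrative only. The built-in set type holds the enumerated set S directly, while the infinite sets I and J are represented by hypothetical membership-test functions (in_I and in_J, names introduced here), since an infinite set cannot be listed element by element.

```python
# Enumeration: the finite set S is written out element by element.
S = {2, 3, 4}


def in_I(x):
    """Description of I: x is a positive integer."""
    return isinstance(x, int) and x > 0


def in_J(x):
    """Description of J: x is a real number with 2 < x < 5."""
    return 2 < x < 5


# Membership tests, mirroring the "is an element of" statements in the text.
print(2 in S, 3 in S)    # both 2 and 3 are elements of S
print(8 in S)            # 8 is not an element of S
print(in_I(8), in_I(9))  # 8 and 9 are elements of I
print(in_J(3.5))         # 3.5 lies strictly between 2 and 5
```

Representing I and J by their defining conditions rather than by a list is the computational analogue of writing a set by description.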


Relationships between Sets 

When two sets are compared with each other, several possible kinds of relationship may be 
observed. If two sets S_1 and S_2 happen to contain identical elements,

S_1 = {2, 7, a, f} and S_2 = {2, a, 7, f}

then S_1 and S_2 are said to be equal (S_1 = S_2). Note that the order of appearance of the
elements in a set is immaterial. Whenever we find even one element to be different in any two
sets, however, those two sets are not equal.

Another kind of set relationship is that one set may be a subset of another set. If we have 
two sets 

S = {1, 3, 5, 7, 9} and T = {3, 7}

then T is a subset of S, because every element of T is also an element of S. A more formal
statement of this is: T is a subset of S if and only if x ∈ T implies x ∈ S. Using the set
inclusion symbols ⊂ (is contained in) and ⊃ (includes), we may then write

T ⊂ S or S ⊃ T

It is possible that two given sets happen to be subsets of each other. When this occurs,
however, we can be sure that these two sets are equal. To state this formally: we can have
S_1 ⊂ S_2 and S_2 ⊂ S_1 if and only if S_1 = S_2.

Note that, whereas the ∈ symbol relates an individual element to a set, the ⊂ symbol
relates a subset to a set. As an application of this idea, we may state on the basis of Fig. 2.1
that the set of all integers is a subset of the set of all rational numbers. Similarly, the set of 
all rational numbers is a subset of the set of all real numbers. 
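
The equality and subset relationships just described can be verified mechanically. The following Python sketch is an illustration under one stated assumption: the letters a and f in the first pair of sets are represented by the strings "a" and "f". Because the text's inclusion symbol allows a set to be a subset of itself, it corresponds to Python's <= operator on sets, not to the strict < operator.

```python
# Two sets with identical elements listed in a different order.
S1 = {2, 7, "a", "f"}
S2 = {2, "a", 7, "f"}
sets_equal = S1 == S2                      # order of appearance is immaterial

# Subset relationship between S and T.
S = {1, 3, 5, 7, 9}
T = {3, 7}
t_in_s = T <= S                            # T is contained in S
s_includes_t = S >= T                      # S includes T
formal_check = all(x in S for x in T)      # x in T implies x in S

# Two sets are subsets of each other if and only if they are equal.
mutual_subsets = (S1 <= S2) and (S2 <= S1)

print(sets_equal, t_in_s, s_includes_t, formal_check, mutual_subsets)
```

The variable formal_check spells out the formal definition element by element, and it agrees with the built-in comparison.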

How many subsets can be formed from the five elements in the set S = {1, 3, 5, 7, 9}?
First of all, each individual element of S can count as a distinct subset of S, such as {1} and
{3}. But so can any pair, triple, or quadruple of these elements, such as {1, 3}, {1, 5}, and
{3, 7, 9}. Any subset that does not contain all the elements of S is called a proper subset of
S. But the set S itself (with all its five elements) can also be considered as one of its own
subsets: every element of S is an element of S, and thus the set S itself fulfills the
definition of a subset. This is, of course, a limiting case, that from which we get the largest
possible subset of S, namely, S itself.

At the other extreme, the smallest possible subset of S is a set that contains no element
at all. Such a set is called the null set, or empty set, denoted by the symbol ∅ or {}. The
reason for considering the null set as a subset of S is quite interesting: If the null set is not a
subset of S (∅ ⊄ S), then ∅ must contain at least one element x such that x ∉ S. But since
by definition the null set has no element whatsoever, we cannot say that ∅ ⊄ S; hence the
null set is a subset of S.

It is extremely important to distinguish the symbol 0 or {) dearly from the notation 
{0}; the former is devoid of dements, but the latter docs contain an element, zero. The null 
set is unique; there is only one such set in the whole world, and it is considered a subset of 
any set that can be conceived. 

Counting all the subsets of S, including the two limiting cases S and 0, wc find a total 
of 2 5 - 32 subsets. In general, if a set has n elements, a total of 2" subsets can be formed 
from those elements, f 

† Given a set with n elements {a, b, c, ..., n} we may first classify its subsets into two categories: one 
with the element a in it, and one without. Each of these two can be further classified into two 
subcategories: one with the element b in it, and one without. Note that by considering the second 
element b, we double the number of categories in the classification from 2 to 4 (= 2^2). By the same 
token, the consideration of the element c will increase the total number of categories to 8 (= 2^3). 
When all n elements are considered, the total number of categories will become the total number of 
subsets, and that number is 2^n. 
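
The counting argument in the footnote can be checked by brute force. The following is a minimal Python sketch, not part of the original text; the helper name `all_subsets` is hypothetical. It enumerates every subset of S = {1, 3, 5, 7, 9}, from the null set up to S itself, and confirms that there are 2^5 = 32 of them.

```python
from itertools import combinations


def all_subsets(elements):
    """Return every subset of `elements`, from the null set up to the full set."""
    items = sorted(elements)
    subsets = []
    for size in range(len(items) + 1):           # size 0 yields the null set
        for combo in combinations(items, size):  # all subsets with `size` elements
            subsets.append(frozenset(combo))
    return subsets


S = {1, 3, 5, 7, 9}
T = {3, 7}
subsets_of_S = all_subsets(S)

assert T <= S                             # T is a subset of S
assert len(subsets_of_S) == 2 ** len(S)   # 2^5 = 32 subsets in all
assert frozenset() in subsets_of_S        # the null set is one of them
assert frozenset(S) in subsets_of_S       # and so is S itself

# A proper subset is any subset other than S itself.
proper_subsets = [s for s in subsets_of_S if s != S]
print(len(subsets_of_S), len(proper_subsets))
```

Grouping the subsets by size (0, 1, ..., n) is only one way to enumerate them; the footnote's "with or without each element" classification would give the same total.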


As a third possible type of set relationship, two sets may have no elements in common 
at all. In that case, the two sets are said to be disjoint. For example, the set of all positive 
integers and the set of all negative integers are mutually exclusive; thus they are disjoint sets. 

A fourth type of relationship occurs when two sets have some elements in common but 
some elements peculiar to each. In that event, the two sets are neither equal nor disjoint; 
also, neither set is a subset of the other. 

Operations on Sets 

When we add, subtract, multiply, divide, or take the square root of some numbers, we are 
performing mathematical operations. Although sets are different from numbers, one can 
similarly perform certain mathematical operations on them. Three principal operations to 
be discussed here involve the union, intersection, and complement of sets. 

To take the union of two sets A and B means to form a new set containing those elements 
(and only those elements) belonging to A, or to B, or to both A and B. The union set is 
symbolized by A ∪ B (read: “A union B”). 

Example 1

If A = {3, 5, 7} and B = {2, 3, 4, 8}, then 

A ∪ B = {2, 3, 4, 5, 7, 8}

This example, incidentally, illustrates the case in which two sets A and B are neither equal 
nor disjoint and in which neither is a subset of the other. 

Example 2

Again referring to Fig. 2.1, we see that the union of the set of all integers and the set of all 
fractions is the set of all rational numbers. Similarly, the union of the rational-number set 
and the irrational-number set yields the set of all real numbers. 

The intersection of two sets A and B, on the other hand, is a new set which contains those 
elements (and only those elements) belonging to both A and B. The intersection set is 
symbolized by A ∩ B (read: “A intersection B”). 

Example 3

From the sets A and B in Example 1, we can write 

A ∩ B = {3}

Example 4

If A = {−3, 6, 10} and B = {9, 2, 7, 4}, then A ∩ B = ∅. Set A and set B are disjoint; 
therefore their intersection is the empty set: no element is common to A and B. 

It is obvious that intersection is a more restrictive concept than union. In the former, 
only the elements common to A and B are acceptable, whereas in the latter, membership in 
either A or B is sufficient to establish membership in the union set. The operator symbols 
∩ and ∪ (which, incidentally, have the same kind of general status as the symbols √, +, 
÷, etc.) therefore have the connotations “and” and “or,” respectively. This point can be 
better appreciated by comparing the following formal definitions of intersection and union: 

Intersection:   A ∩ B = {x | x ∈ A and x ∈ B}

Union:   A ∪ B = {x | x ∈ A or x ∈ B}
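
The two formal definitions translate almost word for word into set comprehensions. The sketch below is an illustration in Python rather than anything from the original text; the function names `union` and `intersection` are hypothetical, and the results are compared with Python's built-in `|` and `&` set operators using the sets from Examples 1, 3, and 4.

```python
def union(a, b):
    """A ∪ B = {x | x ∈ A or x ∈ B}."""
    candidates = list(a) + list(b)
    return {x for x in candidates if x in a or x in b}


def intersection(a, b):
    """A ∩ B = {x | x ∈ A and x ∈ B}."""
    candidates = list(a) + list(b)
    return {x for x in candidates if x in a and x in b}


# Sets from Examples 1 and 3.
A = {3, 5, 7}
B = {2, 3, 4, 8}
assert union(A, B) == {2, 3, 4, 5, 7, 8} == A | B
assert intersection(A, B) == {3} == A & B

# Sets from Example 4: disjoint sets have an empty intersection.
C = {-3, 6, 10}
D = {9, 2, 7, 4}
assert intersection(C, D) == set()
assert C.isdisjoint(D)
```

The only difference between the two functions is the word `or` versus `and`, which mirrors the connotations attached to ∪ and ∩.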

What about the complement of a set? To explain this, let us first introduce the concept of 
the universal set. In a particular context of discussion, if the only numbers used are the set 
of the first seven positive integers, we may refer to it as the universal set U. Then, with a 
given set, say, A = {3, 6, 7}, we can define another set Ã (read: “the complement of A”) as 
the set that contains all the numbers in the universal set U that are not in the set A. That is, 

Ã = {x | x ∈ U and x ∉ A} = {1, 2, 4, 5}

Note that, whereas the symbol ∪ has the connotation “or” and the symbol ∩ means “and,” 
the complement symbol ~ carries the implication of “not.” 

Example 5

If U = {5, 6, 7, 8, 9} and A = {5, 6}, then Ã = {7, 8, 9}. 

Example 6

What is the complement of U? Since every object (number) under consideration is included 
in the universal set, the complement of U must be empty. Thus Ũ = ∅. 
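
A complement is always taken relative to a universal set, so any computational version needs U as an explicit input. The following Python sketch (the function name `complement` is a hypothetical choice, not from the text) reproduces the complements discussed above.

```python
def complement(a, universal):
    """Return {x | x in U and x not in A}; A must be a subset of U."""
    if not a <= universal:
        raise ValueError("the set must be a subset of the universal set")
    return {x for x in universal if x not in a}


# First seven positive integers as the universal set.
U = set(range(1, 8))
A = {3, 6, 7}
assert complement(A, U) == {1, 2, 4, 5}

# A different universal set gives a different context.
U2 = {5, 6, 7, 8, 9}
assert complement({5, 6}, U2) == {7, 8, 9}

# The complement of the universal set itself is the null set.
assert complement(U, U) == set()
```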

The three types of set operation can be visualized in the three diagrams of Fig. 2.2, 
known as Venn diagrams. In diagram a, the points in the upper circle form a set A, and the 
points in the lower circle form a set B. The union of A and B then consists of the shaded area 
covering both circles. In diagram b are shown the same two sets (circles). Since their 
intersection should comprise only the points common to both sets, only the (shaded) 
overlapping portion of the two circles satisfies the definition. In diagram c, let the points in the 
rectangle be the universal set and let A be the set of points in the circle; then the 
complement set Ã will be the (shaded) area outside the circle. 

Laws of Set Operations 

From Fig. 2.2, it may be noted that the shaded area in diagram a represents not only 
A ∪ B but also B ∪ A. Analogously, in diagram b the small shaded area is the visual 
representation not only of A ∩ B but also of B ∩ A. When formalized, this result is known 
as the commutative law (of unions and intersections): 

A ∪ B = B ∪ A        A ∩ B = B ∩ A

These relations are very similar to the algebraic laws a + b = b + a and a × b = b × a. 

FIGURE 2.3 (panels: A ∪ B ∪ C and A ∩ B ∩ C)

To take the union of three sets A, B, and C, we first take the union of any two sets and 
then “union” the resulting set with the third; a similar procedure is applicable to the 
intersection operation. The results of such operations are illustrated in Fig. 2.3. It is interesting 
that the order in which the sets are selected for the operation is immaterial. This fact gives 
rise to the associative law (of unions and intersections): 

A ∪ (B ∪ C) = (A ∪ B) ∪ C

A ∩ (B ∩ C) = (A ∩ B) ∩ C

These equations are strongly reminiscent of the algebraic laws a + (b + c) = (a + b) + c 
and a × (b × c) = (a × b) × c. 

There is also a law of operation that applies when unions and intersections are used in 
combination. This is the distributive law (of unions and intersections): 

A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

These resemble the algebraic law a × (b + c) = (a × b) + (a × c). 

Example 7

Verify the distributive law, given A = {4, 5}, B = {3, 6, 7}, and C = {2, 3}. To verify the first 
part of the law, we find the left- and right-hand expressions separately: 

Left:   A ∪ (B ∩ C) = {4, 5} ∪ {3} = {3, 4, 5}

Right:   (A ∪ B) ∩ (A ∪ C) = {3, 4, 5, 6, 7} ∩ {2, 3, 4, 5} = {3, 4, 5}

Since the two sides yield the same result, the law is verified. Repeating the procedure for the 
second part of the law, we have 

Left:   A ∩ (B ∪ C) = {4, 5} ∩ {2, 3, 6, 7} = ∅

Right:   (A ∩ B) ∪ (A ∩ C) = ∅ ∪ ∅ = ∅

Thus the law is again verified. 

To verify a law means to check by a specific example whether the law actually works 
out. If the law is valid, then any specific example ought indeed to work out. This implies 
that if the law does not check out in as many as one single example, then the law is 
invalidated. On the other hand, the successful verification by specific examples (however many) 
does not in itself prove the law. To prove a law, it is necessary to demonstrate that the law is 
valid for all possible cases. The procedure involved in such a demonstration will be 
illustrated later (see, e.g., Sec. 2.5). 
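
The asymmetry between verifying and invalidating a law can be made concrete with a small search. The Python sketch below is an illustration under stated assumptions: the helper names are hypothetical, the universe {1, 2, 3} is an arbitrary small choice, and `not_a_law` is a deliberately false statement invented for contrast. The search tries every triple of subsets, finds no counterexample to the distributive law, and does find one for the false statement. Even this exhaustive check over one small universe remains a verification, not a proof for all possible sets.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


def first_counterexample(law, universe):
    """Return the first triple (A, B, C) violating `law`, or None if none exists."""
    for a, b, c in product(subsets(universe), repeat=3):
        if not law(a, b, c):
            return set(a), set(b), set(c)
    return None


def distributive_law(a, b, c):
    """Both parts of the distributive law of unions and intersections."""
    first = (a | (b & c)) == ((a | b) & (a | c))
    second = (a & (b | c)) == ((a & b) | (a & c))
    return first and second


def not_a_law(a, b, c):
    """A made-up, false 'law' used only to show how one example invalidates it."""
    return (a | (b & c)) == ((a | b) & c)


universe = {1, 2, 3}
assert first_counterexample(distributive_law, universe) is None
assert first_counterexample(not_a_law, universe) is not None

# The specific sets of Example 7 also satisfy the law.
assert distributive_law({4, 5}, {3, 6, 7}, {2, 3})
```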


EXERCISE 2.3 

1. Write the following in set notation: 
   (a) The set of all real numbers greater than 34. 
   (b) The set of all real numbers greater than 8 but less than 65. 

2. Given the sets S_1 = {2, 4, 6}, S_2 = {7, 2, 6}, S_3 = {4, 2, 6}, and S_4 = {2, 4}, which of the 
   following statements are true? 
   (a) S_1 = S_3                        (d) 3 ∉ S_2      (g) S_1 ⊃ S_4 
   (b) S_1 = R (set of real numbers)    (e) 4 ∉ S_3      (h) ∅ ⊂ S_2 
   (c) 8 ∈ S_2                          (f) S_4 ⊂ R      (i) S_3 ⊃ {1, 2} 

3. Referring to the four sets given in Prob. 2, find: 
   (a) S_1 ∪ S_2      (c) S_2 ∩ S_3      (e) S_4 ∩ S_2 ∩ S_1 
   (b) S_1 ∪ S_3      (d) S_2 ∩ S_4      (f) S_3 ∪ S_1 ∪ S_4 

4. Which of the following statements are valid? 
   (a) A ∪ A = A      (d) A ∪ U = U      (g) The complement of Ã is A. 
   (b) A ∩ A = A      (e) A ∩ ∅ = ∅ 
   (c) A ∪ ∅ = A      (f) A ∩ U = A 

5. Given A = {4, 5, 6}, B = {3, 4, 6, 7}, and C = {2, 3, 6}, verify the distributive law. 

6. Verify the distributive law by means of Venn diagrams, with different orders of 
   successive shading. 

7. Enumerate all the subsets of the set {5, 6, 7}. 

8. Enumerate all the subsets of the set S = {a, b, c, d}. How many subsets are there 
   altogether? 

9. Example 6 shows that ∅ is the complement of U. But since the null set is a subset of 
   any set, ∅ must be a subset of U. Inasmuch as the term "complement of U" implies the 
   notion of being not in U, whereas the term "subset of U" implies the notion of being in 
   U, it seems paradoxical for ∅ to be both of these. How do you resolve this paradox? 
2.4 Relations and Functions 


Our discussion of sets was prompted by the usage of that term in connection with the 
various kinds of numbers in our number system. However, sets can refer as well to objects other 
than numbers. In particular, we can speak of sets of “ordered pairs” (to be defined 
presently), which will lead us to the important concepts of relations and functions. 

Ordered Pairs 

In writing a set {a, b}, we do not care about the order in which the elements a and b appear, 
because by definition {a, b} = {b, a}. The pair of elements a and b is in this case an 
unordered pair. When the ordering of a and b does carry a significance, however, we can write 
two different ordered pairs denoted by (a, b) and (b, a), which have the property that 
(a, b) ≠ (b, a) unless a = b. Similar concepts apply to a set with more than two elements, 
in which case we can distinguish between ordered and unordered triples, quadruples, 
quintuples, and so forth. Ordered pairs, triples, etc., collectively can be called ordered sets; they 
are enclosed with parentheses rather than braces. 


Example 1 

To show the age and weight of each student in a class, we can form ordered pairs (a, w), in 
which the first element indicates the age (in years) and the second element indicates the 
weight (in pounds). Then (19, 127) and (127, 19) would obviously mean different things. 
Moreover, the latter ordered pair would hardly fit any student anywhere. 


Example 2 


When we speak of the set of all contestants in an Olympic game, the order in which they 
are listed is of no consequence and we have an unordered set. But the set (gold-medalist, 
silver-medalist, bronze-medalist) is an ordered triple. 


Ordered pairs, like other objects, can be elements of a set. Consider the rectangular 
(Cartesian) coordinate plane in Fig. 2.4, where an x axis and a y axis cross each other at a 
right angle, dividing the plane into four quadrants. This xy plane is an infinite set of points, 
each of which represents an ordered pair whose first element is an x value and the second 
element a y value. Clearly, the point labeled (4, 2) is different from the point (2, 4); thus 
ordering is significant here. 


FIGURE 2.4 


With this visual understanding, we are ready to consider the process of generation of 
ordered pairs. Suppose, from two given sets, x = {1, 2} and y = {3, 4}, we wish to form all 
the possible ordered pairs with the first element taken from set x and the second element taken 
from set y. The result will, of course, be the set of four ordered pairs (1, 3), (1, 4), (2, 3), and 
(2, 4). This set is called the Cartesian product (named after Descartes), or direct product, of 
the sets x and y and is denoted by x × y (read: “x cross y”). It is important to remember that, 
while x and y are sets of numbers, the Cartesian product turns out to be a set of ordered pairs. 
By enumeration, or by description, we may express this Cartesian product alternatively as 

x × y = {(1, 3), (1, 4), (2, 3), (2, 4)}

or   x × y = {(a, b) | a ∈ x and b ∈ y}

The latter expression may in fact be taken as the general definition of Cartesian product for 
any given sets x and y. 
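
The general definition of the Cartesian product can be written directly as a set comprehension. The Python sketch below is illustrative only (the function name `cartesian_product` is hypothetical); it builds x × y for the two sets used above and compares the result with the standard-library `itertools.product`.

```python
from itertools import product


def cartesian_product(first, second):
    """x × y = {(a, b) | a in x and b in y}."""
    return {(a, b) for a in first for b in second}


x = {1, 2}
y = {3, 4}
pairs = cartesian_product(x, y)

assert pairs == {(1, 3), (1, 4), (2, 3), (2, 4)}
assert pairs == set(product(x, y))   # same result from the standard library
assert len(pairs) == len(x) * len(y)

# The elements are ordered pairs (tuples), not numbers.
assert all(isinstance(p, tuple) and len(p) == 2 for p in pairs)
print(sorted(pairs))
```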

To broaden our horizon, now let both x and y include all the real numbers. Then the 
resulting Cartesian product 

x × y = {(a, b) | a ∈ R and b ∈ R}          (2.3) 

will represent the set of all ordered pairs with real-valued elements. Besides, each ordered 
pair corresponds to a unique point in the Cartesian coordinate plane of Fig. 2.4, and, 
conversely, each point in the coordinate plane also corresponds to a unique ordered pair in the 
set x × y. In view of this double uniqueness, a one-to-one correspondence is said to exist 
between the set of ordered pairs in the Cartesian product (2.3) and the set of points in the 
rectangular coordinate plane. The rationale for the notation x × y is now easy to perceive; 
we may associate it with the crossing of the x axis and the y axis in Fig. 2.4. A simpler way 
of expressing the set x × y in (2.3) is to write it directly as R × R; this is also commonly 
denoted by R^2. 

Extending this idea, we may also define the Cartesian product of three sets x, y, and z as 
follows: 

x × y × z = {(a, b, c) | a ∈ x, b ∈ y, c ∈ z}

which is a set of ordered triples. Furthermore, if the sets x, y, and z each consist of all 
the real numbers, the Cartesian product will correspond to the set of all points in a 
three-dimensional space. This may be denoted by R × R × R, or more simply, R^3. In the present 
discussion, all the variables are taken to be real-valued; thus the framework will generally 
be R^2, or R^3, ..., or R^n. 

Relations and Functions 

Since any ordered pair associates a y value with an x value, any collection of ordered 
pairs (any subset of the Cartesian product (2.3)) will constitute a relation between y and 
x. Given an x value, one or more y values will be specified by that relation. For convenience, 
we shall now write the elements of x × y generally as (x, y) rather than as (a, b), as was 
done in (2.3), where both x and y are variables. 

Example 3 

The set {(x, y) | y = 2x} is a set of ordered pairs including, for example, (1, 2), (0, 0), and 
(−1, −2). It constitutes a relation, and its graphical counterpart is the set of points lying on 
the straight line y = 2x, as seen in Fig. 2.5. 
Example 4 

The set {(x, y) | y ≤ x}, which consists of such ordered pairs as (1, 0), (1, 1), and (1, −4), 
constitutes another relation. In Fig. 2.5, this set corresponds to the set of all points in the 
shaded area which satisfy the inequality y ≤ x. 

Observe that, when the x value is given, it may not always be possible to determine a 
unique y value from a relation. In Example 4, the three exemplary ordered pairs show that 
if x = 1, y can take various values, such as 0, 1, or −4, and yet in each case satisfy the 
stated relation. Graphically, two or more points of a relation may fall on a single vertical 
line in the xy plane. This is exemplified in Fig. 2.5, where many points in the shaded area 
(representing the relation y ≤ x) fall on the broken vertical line labeled x = a. 

As a special case, however, a relation may be such that for each x value there exists only 
one corresponding y value. The relation in Example 3 is a case in point. In such a case, y is 
said to be a function of x, and this is denoted by y = f(x), which is read as “y equals f of 
x.” [Note: f(x) does not mean f times x.] A function is therefore a set of ordered pairs with 
the property that any x value uniquely determines a y value.† It should be clear that a 
function must be a relation, but a relation may not be a function. 

Although the definition of a function stipulates a unique y for each x, the converse is not 
required. In other words, more than one x value may legitimately be associated with the 
same y value. This possibility is illustrated in Fig. 2.6, where the values x_1 and x_2 in the x 
set are both associated with the same value (y_0) in the y set by the function y = f(x). 
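
The defining property of a function (each x value determines exactly one y value) is easy to test on a finite set of ordered pairs. The Python sketch below is an illustration, not part of the original text; the name `is_function` is hypothetical, and because the relations y = 2x and y ≤ x contain infinitely many pairs, only finite integer samples of them are used.

```python
def is_function(pairs):
    """True if no x value in `pairs` is associated with two different y values."""
    image = {}
    for x, y in pairs:
        if x in image and image[x] != y:
            return False          # same x, two different y values
        image[x] = y
    return True


sample = range(-3, 4)   # a finite sample of integer values, chosen for illustration

line = {(x, 2 * x) for x in sample}                           # sample of y = 2x
region = {(x, y) for x in sample for y in sample if y <= x}   # sample of y <= x

assert is_function(line)        # each x has exactly one y
assert not is_function(region)  # x = 1 pairs with 1, 0, -1, ...

# Two different x values may share the same y value without violating the definition.
many_to_one = {(-2, 4), (2, 4)}
assert is_function(many_to_one)
```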

† This definition of function corresponds to what would be called a single-valued function in the older 
terminology. What was formerly called a multivalued function is now referred to as a relation or 
correspondence. 

FIGURE 2.6 

A function is also called a mapping, or transformation; both words connote the action of 
associating one thing with another. In the statement y = f(x), the functional notation f 
may thus be interpreted to mean a rule by which the set x is “mapped” (“transformed”) into 
the set y. Thus we may write 

f: x → y

where the arrow indicates mapping, and the letter f symbolically specifies a rule of 
mapping. Since f represents a particular rule of mapping, a different functional notation must 
be employed to denote another function that may appear in the same model. The customary 
symbols (besides f) used for this purpose are g, F, G, the Greek letters φ (phi) and ψ (psi), 
and their capitals, Φ and Ψ. For instance, two variables y and z may both be functions of x, 
but if one function is written as y = f(x), the other should be written as z = g(x), or 
z = φ(x). It is also permissible, however, to write y = y(x) and z = z(x), thereby 
dispensing with the symbols f and g altogether. 

In the function y = f(x), x is referred to as the argument of the function, and y is called 
the value of the function. We shall also alternatively refer to x as the independent variable 
and y as the dependent variable. The set of all permissible values that x can take in a given 
context is known as the domain of the function, which may be a subset of the set of all real 
numbers. The y value into which an x value is mapped is called the image of that x value. 
The set of all images is called the range of the function, which is the set of all values that 
the y variable can take. Thus the domain pertains to the independent variable x, and the 
range has to do with the dependent variable y. 

As illustrated in Fig. 2.7a, we may regard the function f as a rule for mapping each point 
on some line segment (the domain) into some point on another line segment (the range). By 
placing the domain on the x axis and the range on the y axis, as in Fig. 2.7b, however, we 
immediately obtain the familiar two-dimensional graph, in which the association between 
x values and y values is specified by a set of ordered pairs such as (x_1, y_1) and (x_2, y_2). 

In economic models, behavioral equations usually enter as functions. Since most 
variables in economic models are by their nature restricted to being nonnegative real numbers,† 
their domains are also so restricted. This is why most geometric representations in 
economics are drawn only in the first quadrant. In general, we shall not bother to specify 
the domain of every function in every economic model. When no specification is given, it 
is to be understood that the domain (and the range) will only include numbers for which a 
function makes economic sense. 

† We say "nonnegative" rather than "positive" when zero values are permissible. 

FIGURE 2.7 

Example 5 

The total cost C of a firm per day is a function of its daily output Q: C = 150 + 7Q. The firm 
has a capacity limit of 100 units of output per day. What are the domain and the range of 
the cost function? Inasmuch as Q can vary only between 0 and 100, the domain is the set 
of values 0 ≤ Q ≤ 100; or more formally, 

Domain = {Q | 0 ≤ Q ≤ 100}

As for the range, since the function plots as a straight line, with the minimum C value at 150 
(when Q = 0) and the maximum C value at 850 (when Q = 100), we have 

Range = {C | 150 ≤ C ≤ 850}

Beware, however, that the extreme values of the range may not always occur where the 
extreme values of the domain are attained. 
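
The domain and range in Example 5 can be checked numerically. The Python sketch below is illustrative: it samples only whole-number outputs (the actual domain is every real Q from 0 to 100), and the second function, `bowl`, is a hypothetical example that is not in the text, added to show a case where the extreme value of the range is not attained at an endpoint of the domain.

```python
def total_cost(q):
    """Cost function of Example 5: C = 150 + 7Q."""
    return 150 + 7 * q


# Sample the domain 0 <= Q <= 100 at whole-number outputs only.
outputs = range(0, 101)
costs = [total_cost(q) for q in outputs]

assert min(costs) == total_cost(0) == 150      # lower end of the range
assert max(costs) == total_cost(100) == 850    # upper end of the range


def bowl(x):
    """Hypothetical function y = (x - 2)^2, used only for contrast."""
    return (x - 2) ** 2


# On the domain 0 <= x <= 5 the smallest image occurs at x = 2,
# which is not an endpoint of the domain.
xs = range(0, 6)
images = [bowl(x) for x in xs]
assert min(images) == bowl(2) == 0
assert max(images) == bowl(5) == 9
```

For a straight line the endpoints of the domain always give the endpoints of the range, which is why the shortcut works in Example 5 but not in general.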


EXERCISE 2.4 

1. Given S_1 = {3, 6, 9}, S_2 = {a, b}, and S_3 = {m, n}, find the Cartesian products: 
   (a) S_1 × S_2      (b) S_2 × S_3      (c) S_3 × S_1 

2. From the information in Prob. 1, find the Cartesian product S_1 × S_2 × S_3. 

3. In general, is it true that S_1 × S_2 = S_2 × S_1? Under what conditions will these two 
   Cartesian products be equal? 

4. Does any of the following, drawn in a rectangular coordinate plane, represent a 
   function? 
   (a) A circle        (c) A rectangle 
   (b) A triangle      (d) A downward-sloping straight line 

5. If the domain of the function y = 5 + 3x is the set {x | 1 ≤ x ≤ 9}, find the range of the 
   function and express it as a set. 

6. For the function y = −x^2, if the domain is the set of all nonnegative real numbers, what 
   will its range be? 

7. In the theory of the firm, economists consider the total cost C to be a function of the 
   output level Q: C = f(Q). 
   (a) According to the definition of a function, should each cost figure be associated with 
       a unique level of output? 
   (b) Should each level of output determine a unique cost figure? 

8. If an output level Q_1 can be produced at a cost of C_1, then it must also be possible (by 
   being less efficient) to produce Q_1 at a cost of C_1 + $1, or C_1 + $2, and so on. Thus it 
   would seem that output Q does not uniquely determine total cost C. If so, to write 
   C = f(Q) would violate the definition of a function. How, in spite of this reasoning, 
   would you justify the use of the function C = f(Q)? 

2.5 Types of Function 

The expression y = f(x) is a general statement to the effect that a mapping is possible, but 
the actual rule of mapping is not thereby made explicit. Now let us consider several specific 
types of function, each representing a different rule of mapping. 

Constant Functions 

A function whose range consists of only one element is called a constant function. As an 
example, we cite the function 

y = f(x) = 7

which is alternatively expressible as y = 7 or f(x) = 7, whose value stays the same 
regardless of the value of x. In the coordinate plane, such a function will appear as a 
horizontal straight line. In national-income models, when investment I is exogenously 
determined, we may have an investment function of the form I = $100 million, or I = I_0, which 
exemplifies the constant function. 

Polynomial Functions 

The constant function is actually a “degenerate” case of what are known as polynomial 
functions. The word polynomial means "multiterm," and a polynomial function of a single 
variable x has the general form 

y = a_0 + a_1 x + a_2 x^2 + ··· + a_n x^n          (2.4) 

in which each term contains a coefficient as well as a nonnegative-integer power of the 
variable x. (As will be explained later in this section, we can write x^1 = x and x^0 = 1 in 
general; thus the first two terms may be taken to be a_0 x^0 and a_1 x^1, respectively.) Note that, 
instead of the symbols a, b, c, ..., we have employed the subscripted symbols 
a_0, a_1, ..., a_n for the coefficients. This is motivated by two considerations: (1) we can 
economize on symbols, since only the letter a is "used up" in this way; and (2) the subscript helps 
to pinpoint the location of a particular coefficient in the entire equation. For instance, in 
(2.4), a_2 is the coefficient of x^2, and so forth. 
Depending on the value of the integer n (which specifies the highest power of x), we 
have several subclasses of polynomial function: 

Case of n = 0      y = a_0                                    [constant function] 
Case of n = 1      y = a_0 + a_1 x                            [linear function] 
Case of n = 2      y = a_0 + a_1 x + a_2 x^2                  [quadratic function] 
Case of n = 3      y = a_0 + a_1 x + a_2 x^2 + a_3 x^3        [cubic function] 


and so forth. The superscript indicators of the powers of x are called exponents. The 
highest power involved, i.e., the value of n, is often called the degree of the polynomial 
function; a quadratic function, for instance, is a second-degree polynomial, and a cubic function 
is a third-degree polynomial.† The order in which the several terms appear to the right of 
the equals sign is inconsequential; they may be arranged in descending order of power 
instead. Also, even though we have put the symbol y on the left, it is also acceptable to write 
f(x) in its place. 

When plotted in the coordinate plane, a linear function will appear as a straight line, as 
illustrated in Fig. 2.8a. When x = 0, the linear function yields y = a_0; thus the ordered pair 
(0, a_0) is on the line. This gives us the so-called y intercept (or vertical intercept), because 
it is at this point that the vertical axis intersects the line. The other coefficient, a_1, measures 
the slope (the steepness of incline) of our line. This means that a unit increase in x will 
result in an increment in y in the amount of a_1. What Fig. 2.8a illustrates is the case of a_1 > 0, 
involving a positive slope and thus an upward-sloping line; if a_1 < 0, the line will be 
downward-sloping. 

A quadratic function, on the other hand, plots as a parabola (roughly, a curve with a 
single built-in bump or wiggle). The particular illustration in Fig. 2.8b implies a negative a_2; 
in the case of a_2 > 0, the curve will “open” the other way, displaying a valley rather than a 
hill. The graph of a cubic function will, in general, manifest two wiggles, as illustrated in 
Fig. 2.8c. These functions will be used quite frequently in the economic models 
subsequently discussed. 
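
The general form (2.4) lends itself to a compact numerical sketch. In the Python illustration below, which is not part of the original text, a polynomial is stored as its list of coefficients [a_0, a_1, ..., a_n]; the function names and the particular coefficient values are hypothetical choices made only to show the intercept, the slope, and the hill-versus-valley behavior described above.

```python
def polynomial(coefficients, x):
    """Evaluate a_0 + a_1 x + a_2 x^2 + ... + a_n x^n at the point x."""
    return sum(a * x ** k for k, a in enumerate(coefficients))


def degree(coefficients):
    """Highest power with a nonzero coefficient (taken as 0 if all are zero)."""
    for k in range(len(coefficients) - 1, -1, -1):
        if coefficients[k] != 0:
            return k
    return 0


# Linear function y = 3 + 2x: intercept a_0 = 3, slope a_1 = 2.
linear = [3, 2]
assert polynomial(linear, 0) == 3                           # y intercept
assert polynomial(linear, 5) - polynomial(linear, 4) == 2   # unit increase in x

# Quadratic with a_2 < 0 displays a hill; with a_2 > 0, a valley.
hill = [0, 4, -1]      # y = 4x - x^2
valley = [0, -4, 1]    # y = -4x + x^2
assert polynomial(hill, 2) > polynomial(hill, 1) == polynomial(hill, 3)
assert polynomial(valley, 2) < polynomial(valley, 1) == polynomial(valley, 3)

# A zero leading coefficient lowers the degree.
assert degree([3, 2, 0]) == 1
assert degree(hill) == 2
```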


Rational Functions 

A function such as 

y = (x − 1) / (x^2 + 2x + 4)

in which y is expressed as a ratio of two polynomials in the variable x, is known as a 
rational function. According to this definition, any polynomial function must itself be a rational 
function, because it can always be expressed as a ratio to 1, and 1 is a constant function. 
A special rational function that has interesting applications in economics is the function 

y = a/x   or   xy = a

which plots as a rectangular hyperbola, as in Fig. 2.8d. Since the product of the two 
variables is always a fixed constant in this case, this function may be used to represent that 
special demand curve—with price P and quantity Q on the two axes—for which the total 


† In the several equations just cited, the last coefficient (a_n) is always assumed to be nonzero; 
otherwise the function would degenerate into a lower-degree polynomial. 

The rectangular hyperbola drawn from xy = a never meets the axes, even if extended 
indefinitely upward and to the right. Rather, the curve approaches the axes asymptotically: 
as y becomes very large, the curve will come ever closer to the y axis but never actually 
reach it, and similarly for the x axis. The axes constitute the asymptotes of this function. 

Nonalgebraic Functions 

Any function expressed in terms of polynomials and/or roots (such as square root) of 
polynomials is an algebraic function. Accordingly, the functions discussed thus far are all 
algebraic. 

However, exponential functions such as y = b^x, in which the independent variable 
appears in the exponent, are nonalgebraic. The closely related logarithmic functions, such as 
y = log_b x, are also nonalgebraic. These two types of function have a special role to play in 
certain types of economic applications, and it is pedagogically desirable to postpone their 
discussion to Chap. 10. Here, we simply preview their general graphic shapes in Fig. 2.8e 
and f. Other types of nonalgebraic function are the trigonometric (or circular) functions, 
which we shall discuss in Chap. 16 in connection with dynamic analysis. We should add 
here that nonalgebraic functions are also known by the more esoteric name of 
transcendental functions. 

A Digression on Exponents 

In discussing polynomial functions, we introduced the term exponents as indicators of the 
power to which a variable (or number) is to be raised. The expression 6^2 means that 6 is to 
be raised to the second power; that is, 6 is to be multiplied by itself, or 6^2 = 6 × 6 = 36. In 
general, we define, for a positive integer n, 

x^n = x × x × ··· × x      (n terms)

and as a special case, we note that x^1 = x. From the general definition, it follows that for 
positive integers m and n, exponents obey the following rules: 

Rule I      x^m × x^n = x^(m+n)      (for example, x^3 × x^4 = x^7) 

Proof      x^m × x^n = (x × x × ··· × x) × (x × x × ··· × x)      (m terms, then n terms) 
                     = x × x × ··· × x                            (m + n terms) 
                     = x^(m+n) 

Note that in this proof, we did not assign any specific value to the number x, or to the 
exponents m and n. Thus the result obtained is generally true. It is for this reason that 
the demonstration given constitutes a proof, as against a mere verification. The same can be 
said about the proof of Rule II which follows. 

Rule II      x^m / x^n = x^(m−n)      (x ≠ 0)      (for example, x^4 / x^3 = x) 

Proof      x^m / x^n = (x × x × ··· × x) / (x × x × ··· × x)      (m terms over n terms) 
                     = x × x × ··· × x                            (m − n terms) 
                     = x^(m−n) 

because the n terms in the denominator cancel out n of the m terms in the numerator. Note 
that the case of x = 0 is ruled out in the statement of this rule. This is because when x = 0, 
the expression x^m / x^n would involve division by zero, which is undefined. 

What if m < n, say, m = 2 and n = 5? In that case we get, according to Rule II, 
x^(m−n) = x^(−3), a negative power of x. What does this mean? The answer is actually supplied 
by Rule II itself: When m = 2 and n = 5, we have 

x^2 / x^5 = (x × x) / (x × x × x × x × x) = 1 / (x × x × x) = 1 / x^3

Thus x^(−3) = 1 / x^3, and this may be generalized into another rule: 

Rule III      x^(−n) = 1 / x^n      (x ≠ 0) 

To raise a (nonzero) number to a power of negative n is to take the reciprocal of its nth 
power. 

Another special case in the application of Rule II is when m = n, which yields the 
expression x^(m−n) = x^(m−m) = x^0. To interpret the meaning of raising a number x to the zeroth 
power, we can write out the term x^(m−m) in accordance with Rule II, with the result that 
x^m / x^m = 1. Thus we may conclude that any (nonzero) number raised to the zeroth power 
is equal to 1. (The expression 0^0 is undefined.) This may be expressed as another rule: 

Rule IV      x^0 = 1      (x ≠ 0) 

As long as we are concerned only with polynomial functions, only {nonnegative) integer 
powers are required. In exponential functions, however, the exponent is a variable that can 
take non integer values as well. In order to interpret a number such as .t 1 '' 2 , let us consider 
the fact that, by Rule I, we have 


Since x l/2 multiplied by itself is x, x i/2 must be the square root of x. Similarly, .v 1 ' 2 can be 
shown to be the cube root of .r. In general, therefore, we can state the following rule: 


Rule V 


M 


'■■=77 


Two other rules obeyed by exponents are

Rule VI    $(x^m)^n = x^{mn}$

Rule VII    $x^m \times y^m = (xy)^m$
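
The rules of exponents stated above can be spot-checked numerically. The following is only a minimal sketch: the function name `check_exponent_rules` and the sample values are illustrative assumptions, and a numerical check for one set of values is of course not a proof. A positive x is used so that the nth root in Rule V is a real number.

```python
import math


def check_exponent_rules(x: float, y: float, m: int, n: int) -> dict:
    """Compare both sides of Rules II to VII for one choice of x, y, m, n.

    Hypothetical helper for illustration; x and y are assumed positive.
    """
    close = math.isclose
    return {
        "II": close(x**m / x**n, x ** (m - n)),     # x^m / x^n = x^(m-n)
        "III": close(x ** (-n), 1 / x**n),          # x^(-n) = 1 / x^n
        "IV": x**0 == 1,                            # x^0 = 1
        "V": close((x ** (1 / n)) ** n, x),         # x^(1/n) is the nth root of x
        "VI": close((x**m) ** n, x ** (m * n)),     # (x^m)^n = x^(mn)
        "VII": close(x**m * y**m, (x * y) ** m),    # x^m * y^m = (xy)^m
    }


# The case m = 2, n = 5 discussed in the text, with x = 2 and y = 3.
results = check_exponent_rules(x=2.0, y=3.0, m=2, n=5)
print(results)
```

Comparisons use `math.isclose` rather than exact equality because fractional powers are computed in floating point.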


EXERCISE 2.5 

1. Graph the functions

(a) y = 16 + 2x    (b) y = 8 - 2x    (c) y = 2x + 12

(In each case, consider the domain as consisting of nonnegative real numbers only.)

2. What is the major difference between (a) and (b) in Prob. 1? How is this difference
reflected in the graphs? What is the major difference between (a) and (c)? How do their
graphs reflect it?

3. Graph the functions

(a) $y = -x^2 + 5x - 2$    (b) $y = x^2 + 5x - 2$

with the set of values -5 < x < 5 constituting the domain. It is well known that the
sign of the coefficient of the $x^2$ term determines whether the graph of a quadratic function
will have a "hill" or a "valley." On the basis of the present problem, which sign is
associated with the hill? Supply an intuitive explanation for this.

4. Graph the function y = 36/x, assuming that x and y can take positive values only. Next,
suppose that both variables can take negative values as well; how must the graph be
modified to reflect this change in assumption?

5. Condense the following expressions:

(a) $x^4 \times x^{15}$    (b) $x^a \times x^b \times x^c$    (c) $x^3 \times y^3 \times z^3$

6. Find: (a) $x^3/x^{-3}$    (b) $(x^{1/2} \times x^{1/3})/x^{2/3}$

7. Show that $x^{m/n} = \sqrt[n]{x^m} = (\sqrt[n]{x})^m$. Specify the rules applied in each step.

8. Prove Rule VI and Rule VII.


2.6 Functions of Two or More Independent Variables

Thus far, we have considered only functions of a single independent variable, y = f(x).
But the concept of a function can be readily extended to the case of two or more independent
variables. Given a function

$$z = g(x, y)$$

a given pair of x and y values will uniquely determine a value of the dependent variable z.
Such a function is exemplified by

$$z = ax + by \qquad \text{or} \qquad z = a_0 + a_1 x + a_2 x^2 + b_1 y + b_2 y^2$$

Just as the function y = f(x) maps a point in the domain into a point in the range, the
function g will do precisely the same. However, the domain is in this case no longer a set of
numbers but a set of ordered pairs (x, y), because we can determine z only when both x
and y are specified. The function g is thus a mapping from a point in a two-dimensional
space into a point on a line segment (i.e., a point in a one-dimensional space), such as from
the point $(x_1, y_1)$ into the point $z_1$ or from $(x_2, y_2)$ into $z_2$ in Fig. 2.9a.

If a vertical z axis is erected perpendicular to the xy plane, as is done in diagram b,
however, there will result a three-dimensional space in which the function g can be given a
graphical representation as follows. The domain of the function will be some subset of
the points in the xy plane, and the value of the function (value of z) for a given point in the
domain, say, $(x_1, y_1)$, can be indicated by the height of a vertical line planted on that
point. The association between the three variables is thus summarized by the ordered triple
$(x_1, y_1, z_1)$, which is a specific point in the three-dimensional space. The locus of such
ordered triples, which will take the form of a surface, then constitutes the graph of the
function g. Whereas the function y = f(x) is a set of ordered pairs, the function z = g(x, y)
will be a set of ordered triples. We shall have many occasions to use functions of this type
in economic models. One ready application is in the area of production functions. Suppose
that output is determined by the amounts of capital (K) and labor (L) employed; then
we can write a production function in the general form Q = Q(K, L).

FIGURE 2.9

The possibility of further extension to the cases of three or more independent variables
is now self-evident. With the function y = h(u, v, w), for example, we can map a point in
the three-dimensional space, $(u_1, v_1, w_1)$, into a point in a one-dimensional space $(y_1)$.
Such a function might be used to indicate that a consumer's utility is a function of his or her
consumption of three different commodities, and the mapping is from a three-dimensional
commodity space into a one-dimensional utility space. But this time it will be physically
impossible to graph the function, because for that task a four-dimensional diagram is
needed to picture the ordered quadruples, but the world in which we live is only
three-dimensional. Nonetheless, in view of the intuitive appeal of geometric analogy, we can
continue to refer to an ordered quadruple $(y_1, u_1, v_1, w_1)$ as a "point" in the four-dimensional
space. The locus of such points will give the (nongraphable) "graph" of the function
y = h(u, v, w), which is called a hypersurface. These terms, viz., point and hypersurface,
are also carried over to the general case of the n-dimensional space.

Functions of more than one variable can be classified into various types, too. For
instance, a function of the form

$$y = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$

is a linear function, whose characteristic is that every variable is raised to the first power
only. A quadratic function, on the other hand, involves first and second powers of one or
more independent variables, but the sum of exponents of the variables appearing in any
single term must not exceed 2.

Note that instead of denoting the independent variables by x, u, v, w, etc., we have
switched to the symbols $x_1, x_2, \ldots, x_n$. The latter notation, like the system of subscripted
coefficients, has the merit of economy of alphabet, as well as of an easier accounting of the
number of variables involved in a function.


2.7 Levels of Generality

In discussing the various types of function, we have without explicit notice introduced
examples of functions that pertain to varying levels of generality. In certain instances, we
have written functions in the form

$$y = 7 \qquad y = 6x + 4 \qquad y = x^2 - 3x + 1 \qquad \text{(etc.)}$$

Not only are these expressed in terms of numerical coefficients, but they also indicate 
specifically whether each function is constant, linear, or quadratic. In terms of graphs, each 
such function will give rise to a well-defined unique curve. In view of the numerical nature 
of these functions, the solutions of the model based on them will emerge as numerical values
also. The drawback is that, if we wish to know how our analytical conclusion will
change when a different set of numerical coefficients comes into effect, we must go through 
the reasoning process afresh each time. Thus, the results obtained from specific functions 
have very little generality. 

On a more general level of discussion and analysis, there are functions in the form

$$y = a \qquad y = a + bx \qquad y = a + bx + cx^2 \qquad \text{(etc.)}$$

Since parameters are used, each function represents not a single curve but a whole family
of curves. The function y = a, for instance, encompasses not only the specific cases
y = 0, y = 1, and y = 2 but also y = 1/3, y = -5, ..., ad infinitum. With parametric
functions, the outcome of mathematical operations will also be in terms of parameters. These
results are more general in the sense that, by assigning various values to the parameters
appearing in the solution of the model, a whole family of specific answers may be obtained
without having to repeat the reasoning process anew.

In order to attain an even higher level of generality, we may resort to the general function
statement y = f(x), or z = g(x, y). When expressed in this form, the function is not
restricted to being either linear, quadratic, exponential, or trigonometric, all of which are
subsumed under the notation. The analytical result based on such a general formulation
will therefore have the most general applicability. As will be found below, however, in order
to obtain economically meaningful results, it is often necessary to impose certain qualitative
restrictions on the general functions built into a model, such as the restriction that a
demand function have a negatively sloped graph or that a consumption function have a
graph with a positive slope of less than 1.
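
The three levels of generality can be mirrored in a short program. This is only an illustrative sketch: the names `specific`, `linear`, and `average_rate_of_change` are hypothetical, and the last helper is simply one example of a result that holds for any function f, whatever its form.

```python
from typing import Callable


# Level 1: a specific (numerical) function, y = 6x + 4.
def specific(x: float) -> float:
    return 6 * x + 4


# Level 2: a parametric function, y = a + bx.
# Each choice of (a, b) selects one member of the family of lines.
def linear(a: float, b: float) -> Callable[[float], float]:
    return lambda x: a + b * x


# Level 3: a statement about a general function f, with no form assumed.
def average_rate_of_change(f: Callable[[float], float], x0: float, x1: float) -> float:
    return (f(x1) - f(x0)) / (x1 - x0)


member = linear(a=4, b=6)                    # the family member equal to `specific`
print(member(2), specific(2))                # both evaluate y at x = 2
print(average_rate_of_change(member, 0, 1))  # works for any f, not only lines
```

Changing the parameters passed to `linear` yields a different specific function without rewriting any of the reasoning, which is the practical sense in which parametric results are more general.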

To sum up the present chapter, the structure of a mathematical economic model is
now clear. In general, it will consist of a system of equations, which may be definitional,
behavioral, or in the nature of equilibrium conditions.† The behavioral equations are usually
in the form of functions, which may be linear or nonlinear, numerical or parametric,
and with one independent variable or many. It is through these that the analytical assumptions
adopted in the model are given mathematical expression.

In attacking an analytical problem, therefore, the first step is to select the appropriate
variables (exogenous as well as endogenous) for inclusion in the model. Next, we must
translate into equations the set of chosen analytical assumptions regarding the human,
institutional, technological, legal, and other behavioral aspects of the environment affecting
the working of the variables. Only then can we attempt to derive a set of conclusions
through relevant mathematical operations and manipulations and to give them appropriate
economic interpretations.


† Inequalities may also enter as an important ingredient of a model, but we shall not worry about
them for the time being.


















Chapter 3


Equilibrium Analysis
in Economics


The analytical procedure outlined in Chap. 2 will first be applied to what is known as static 
analysis, or equilibrium analysis. For this purpose, it is imperative first to have a clear 
understanding of what equilibrium means. 

3.1 The Meaning of Equilibrium

Like any economic term, equilibrium can be defined in various ways. According to one
definition, an equilibrium is "a constellation of selected interrelated variables so adjusted
to one another that no inherent tendency to change prevails in the model which they
constitute."† Several words in this definition deserve special attention. First, the word selected
underscores the fact that there do exist variables which, by the analyst's choice, have not
been included in the model. Hence the equilibrium under discussion can have relevance
only in the context of the particular set of variables chosen, and if the model is enlarged to
include additional variables, the equilibrium state pertaining to the smaller model will no
longer apply.

Second, the word interrelated suggests that, in order for equilibrium to occur, all variables
in the model must simultaneously be in a state of rest. Moreover, the state of rest of
each variable must be compatible with that of every other variable; otherwise some
variable(s) will be changing, thereby also causing the others to change in a chain reaction, and
no equilibrium can be said to exist.

Third, the word inherent implies that, in defining an equilibrium, the state of rest
involved is based only on the balancing of the internal forces of the model, while the external
factors are assumed fixed. Operationally, this means that parameters and exogenous
variables are treated as constants. When the external factors do actually change, there will
be a new equilibrium defined on the basis of the new parameter values, but in defining the
new equilibrium, the new parameter values are again assumed to persist and stay
unchanged.

† Fritz Machlup, "Equilibrium and Disequilibrium: Misplaced Concreteness and Disguised Politics,"
Economic Journal, March 1958, p. 9. (Reprinted in F. Machlup, Essays on Economic Semantics,
Prentice Hall, Inc., Englewood Cliffs, N.J., 1963.)

In essence, an equilibrium for a specified model is a situation characterized by a lack of
tendency to change. It is for this reason that the analysis of equilibrium (more specifically,
the study of what the equilibrium state is like) is referred to as statics.

The fact that an equilibrium implies no tendency to change may tempt one to conclude 
that an equilibrium necessarily constitutes a desirable or ideal state of affairs, on the 
ground that only in the ideal state would there be a lack of motivation for change. Such a 
conclusion is unwarranted. Even though a certain equilibrium position may represent a
desirable state and something to be striven for (such as a profit-maximizing situation,
from the firm's point of view), another equilibrium position may be quite undesirable and
therefore something to be avoided, such as an underemployment equilibrium level of
national income. The only warranted interpretation is that an equilibrium is a situation
which, if attained, would tend to perpetuate itself, barring any changes in the external 
forces. 

The desirable variety of equilibrium, which we shall refer to as goal equilibrium, will be
treated later in Part 4 as optimization problems. In the present chapter, the discussion will
be confined to the nongoal type of equilibrium, resulting not from any conscious aiming at
a particular objective but from an impersonal or suprapersonal process of interaction and
adjustment of economic forces. Examples of this are the equilibrium attained by a market
under given demand and supply conditions and the equilibrium of national income under
given conditions of consumption and investment patterns.

3.2 Partial Market Equilibrium—A Linear Model

In a static-equilibrium model, the standard problem is that of finding the set of values of the
endogenous variables which will satisfy the equilibrium condition of the model. This is
because once we have identified those values, we have in effect identified the equilibrium
state. Let us illustrate with a so-called partial-equilibrium market model, i.e., a model of
price determination in an isolated market.

Constructing the Model 

Since only one commodity is being considered, it is necessary to include only three
variables in the model: the quantity demanded of the commodity ($Q_d$), the quantity supplied
of the commodity ($Q_s$), and its price (P). The quantity is measured, say, in pounds per
week, and the price in dollars. Having chosen the variables, our next order of business is
to make certain assumptions regarding the working of the market. First, we must specify
an equilibrium condition, something indispensable in an equilibrium model. The standard
assumption is that equilibrium occurs in the market if and only if the excess demand
is zero ($Q_d - Q_s = 0$), that is, if and only if the market is cleared. But this immediately
raises the question of how $Q_d$ and $Q_s$ themselves are determined. To answer this, we
assume that $Q_d$ is a decreasing linear function of P (as P increases, $Q_d$ decreases). On
the other hand, $Q_s$ is postulated to be an increasing linear function of P (as P increases,
so does $Q_s$), with the proviso that no quantity is supplied unless the price exceeds a
particular positive level. In all, then, the model will contain one equilibrium condition
plus two behavioral equations which govern the demand and supply sides of the market,
respectively.

FIGURE 3.1

Translated into mathematical statements, the model can be written as

$$Q_d = Q_s$$
$$Q_d = a - bP \qquad (a, b > 0) \tag{3.1}$$
$$Q_s = -c + dP \qquad (c, d > 0)$$

Four parameters, a, b, c, and d, appear in the two linear functions, and all of them are
specified to be positive. When the demand function is graphed, as in Fig. 3.1, its vertical
intercept is at a and its slope is -b, which is negative, as required. The supply function also has
the required type of slope, d being positive, but its vertical intercept is seen to be negative,
at -c. Why did we want to specify such a negative vertical intercept? The answer is that, in
so doing, we force the supply curve to have a positive horizontal intercept at $P_1$, thereby
satisfying the proviso stated earlier that supply will not be forthcoming unless the price is
positive and sufficiently high.

The reader should observe that, contrary to the usual practice, quantity rather than price
has been plotted vertically in Fig. 3.1. This, however, is in line with the mathematical
convention of placing the dependent variable on the vertical axis. In a different context in
which the demand curve is viewed from the standpoint of a business firm as describing the
average-revenue curve, $AR \equiv P = f(Q_d)$, we shall reverse the axes and plot P vertically.

With the model thus constructed, the next step is to solve it, i.e., to obtain the solution
values of the three endogenous variables, $Q_d$, $Q_s$, and P. The solution values are those
values that satisfy the three equations in (3.1) simultaneously; i.e., they are the values
which, when substituted into the three equations, make the latter a set of true statements. In
the context of an equilibrium model, those values may also be referred to as the equilibrium
values of the said variables.

Many writers employ no special symbols to denote the solution values of the endogenous
variables. Thus, $Q_d$ is used to represent either the quantity-demanded variable (with a
whole range of values) or its solution value (a specific value); and similarly for the symbols
$Q_s$ and P. Unfortunately, this practice can give rise to possible confusions, especially in the
context of comparative-static analysis (e.g., Sec. 7.5). To avoid such a source of confusion,
we shall denote the solution value of an endogenous variable with an asterisk. Thus, the
solution values of $Q_d$, $Q_s$, and P are denoted by $Q_d^*$, $Q_s^*$, and $P^*$, respectively. Since
$Q_d^* = Q_s^*$, however, they can even be replaced by a single symbol $Q^*$. Hence, an equilibrium
solution of the model may simply be denoted by an ordered pair $(P^*, Q^*)$. In case the
solution is not unique, several ordered pairs may each satisfy the system of simultaneous
equations; there will then be a solution set with more than one element in it. However, the
multiple-equilibrium situation cannot arise in a linear model such as the present one.


Solution by Elimination of Variables 

One way of finding a solution to an equation system is by successive elimination of
variables and equations through substitution. In (3.1), the model contains three equations in
three variables. However, in view of the equating of $Q_d$ and $Q_s$ by the equilibrium condition,
we can let $Q = Q_d = Q_s$ and rewrite the model equivalently as follows:

$$Q = a - bP$$
$$Q = -c + dP \tag{3.2}$$

thereby reducing the model to two equations in two variables. Moreover, by substituting the
first equation into the second in (3.2), the model can be further reduced to a single equation
in a single variable:

$$a - bP = -c + dP$$

or, after subtracting (a + dP) from both sides of the equation and multiplying through
by -1,

$$(b + d)P = a + c \tag{3.3}$$

This result is also obtainable directly from (3.1) by substituting the second and third
equations into the first.

Since $b + d \neq 0$, it is permissible to divide both sides of (3.3) by (b + d). The result is
the solution value of P:

$$P^* = \frac{a + c}{b + d} \tag{3.4}$$

Note that $P^*$ is (as all solution values should be) expressed entirely in terms of the
parameters, which represent given data for the model. Thus $P^*$ is a determinate value, as
it ought to be. Also note that $P^*$ is positive, as a price should be, because all four
parameters are positive by model specification.

To find the equilibrium quantity $Q^*$ $(= Q_d^* = Q_s^*)$ that corresponds to the value $P^*$,
simply substitute (3.4) into either equation of (3.2), and then solve the resulting equation.
Substituting (3.4) into the demand function, for instance, we can get

$$Q^* = a - \frac{b(a + c)}{b + d} = \frac{a(b + d) - b(a + c)}{b + d} = \frac{ad - bc}{b + d} \tag{3.5}$$

which is again an expression in terms of parameters only. Since the denominator (b + d) is
positive, the positivity of $Q^*$ requires that the numerator (ad - bc) be positive as well.
Hence, to be economically meaningful, the present model should contain the additional
restriction that ad > bc.
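
Formulas (3.4) and (3.5) translate directly into a small routine. The sketch below is illustrative only: the function name `linear_equilibrium` and the parameter values a = 10, b = 2, c = 2, d = 4 are assumptions chosen for the example, not values taken from the text. Exact fractions are used so that no rounding enters the check.

```python
from fractions import Fraction


def linear_equilibrium(a, b, c, d):
    """Return (P*, Q*) for the model Q_d = a - bP, Q_s = -c + dP, Q_d = Q_s.

    Hypothetical helper implementing (3.4) and (3.5).
    """
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    if b + d == 0:
        raise ValueError("b + d must be nonzero to divide in (3.3)")
    p_star = (a + c) / (b + d)          # formula (3.4)
    q_star = (a * d - b * c) / (b + d)  # formula (3.5)
    return p_star, q_star


# Illustrative parameters (assumed): all positive, with ad > bc.
a, b, c, d = 10, 2, 2, 4
p_star, q_star = linear_equilibrium(a, b, c, d)

# The solution should satisfy both behavioral equations in (3.2).
print(p_star, q_star)
print(a - b * p_star == q_star, -c + d * p_star == q_star)
```

With these assumed values ad = 40 exceeds bc = 4, so the restriction ad > bc holds and the equilibrium quantity is positive.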

The meaning of this restriction can be seen in Fig. 3.1. It is well known that the $P^*$ and
$Q^*$ of a market model may be determined graphically at the intersection of the demand and
supply curves. To have $Q^* > 0$ is to require the intersection point to be located above the
horizontal axis in Fig. 3.1, which in turn requires the slopes and vertical intercepts of the
two curves to fulfill a certain restriction on their relative magnitudes. That restriction,
according to (3.5), is ad > bc, given that both b and d are positive.

The intersection of the demand and supply curves in Fig. 3.1, incidentally, is in concept
no different from the intersection shown in the Venn diagram of Fig. 2.2b. There is one
difference only: Instead of the points lying within two circles, the present case involves the
points that lie on two lines. Let the set of points on the demand and supply curves be
denoted, respectively, by D and S. Then, by utilizing the symbol $Q$ $(= Q_d = Q_s)$, the two
sets and their intersection can be written

$$D = \{(P, Q) \mid Q = a - bP\}$$
$$S = \{(P, Q) \mid Q = -c + dP\}$$
$$\text{and} \qquad D \cap S = \{(P^*, Q^*)\}$$

The intersection set contains in this instance only a single element, the ordered pair
$(P^*, Q^*)$. The market equilibrium is unique.


EXERCISE 3.2

1. Given the market model

$Q_d = Q_s$
$Q_d = 21 - 3P$
$Q_s = -4 + 8P$

find $P^*$ and $Q^*$ by (a) elimination of variables and (b) using formulas (3.4) and (3.5).
(Use fractions rather than decimals.)

2. Let the demand and supply functions be as follows:

(a) $Q_d = 51 - 3P$    (b) $Q_d = 30 - 2P$
    $Q_s = 6P - 10$        $Q_s = -6 + 5P$

find $P^*$ and $Q^*$ by elimination of variables. (Use fractions rather than decimals.)

3. According to (3.5), for $Q^*$ to be positive, it is necessary that the expression (ad - bc)
have the same algebraic sign as (b + d). Verify that this condition is indeed satisfied in
the models of Probs. 1 and 2.

4. If (b + d) = 0 in the linear market model, can an equilibrium solution be found by
using (3.4) and (3.5)? Why or why not?

5. If (b + d) = 0 in the linear market model, what can you conclude regarding the
positions of the demand and supply curves in Fig. 3.1? What can you conclude, then,
regarding the equilibrium solution?

3.3 Partial Market Equilibrium—A Nonlinear Model

Let the linear demand in the isolated market model be replaced by a quadratic demand
function, while the supply function remains linear. Also, let us use numerical coefficients
rather than parameters. Then a model such as the following may emerge:

$$Q_d = Q_s$$
$$Q_d = 4 - P^2 \tag{3.6}$$
$$Q_s = 4P - 1$$

As previously, this system of three equations can be reduced to a single equation by
elimination of variables (by substitution):

$$4 - P^2 = 4P - 1$$

or

$$P^2 + 4P - 5 = 0 \tag{3.7}$$

This is a quadratic equation because the left-hand expression is a quadratic function of
variable P. A major difference between a quadratic equation and a linear one is that, in general,
the former will yield two solution values.

Quadratic Equation versus Quadratic Function 

Before discussing the method of solution, a clear distinction should be made between the
two terms quadratic equation and quadratic function. According to the earlier discussion,
the expression $P^2 + 4P - 5$ constitutes a quadratic function, say, f(P). Hence we may write

$$f(P) = P^2 + 4P - 5 \tag{3.8}$$

What (3.8) does is to specify a rule of mapping from P to f(P), such as

P      ...   -6   -5   -4   -3   -2   -1    0    1    2   ...
f(P)   ...    7    0   -5   -8   -9   -8   -5    0    7   ...

Although we have listed only nine P values in this table, actually all the P values in the
domain of the function are eligible for listing. It is perhaps for this reason that we rarely speak
of "solving" the equation $f(P) = P^2 + 4P - 5$, because we normally expect "solution
values" to be few in number, but here all P values can get involved. Nevertheless, one may
legitimately consider each ordered pair in the table, such as (-6, 7) and (-5, 0), as a
solution of (3.8), since each such ordered pair indeed satisfies that equation. Inasmuch as an
infinite number of such ordered pairs can be written, one for each P value, there is an
infinite number of solutions to (3.8). When plotted as a curve, these ordered pairs together
yield the parabola in Fig. 3.2.
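
The mapping in (3.8) can be tabulated mechanically. The snippet below is a minimal sketch that regenerates the nine ordered pairs listed in the table; the names `f` and `table` are illustrative choices.

```python
def f(P: int) -> int:
    """The quadratic function (3.8): f(P) = P^2 + 4P - 5."""
    return P**2 + 4 * P - 5


# Ordered pairs (P, f(P)) for the nine integer P values from -6 to 2.
table = [(P, f(P)) for P in range(-6, 3)]

for P, value in table:
    print(f"{P:>3} {value:>3}")

# The P values at which f(P) = 0 are the ones that also satisfy (3.7).
zeros = [P for P, value in table if value == 0]
print(zeros)
```

Any other P value in the domain could be added to the range in the same way, which is the sense in which (3.8) has infinitely many solutions as ordered pairs.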

In (3.7), where we set the quadratic function f(P) equal to zero, the situation is
fundamentally changed. Since the variable f(P) now disappears (having been assigned a
zero value), the result is a quadratic equation in the single variable P.† Now that f(P) is
restricted to a zero value, only a select number of P values can satisfy (3.7) and qualify as
its solution values, namely, those P values at which the parabola in Fig. 3.2 intersects the
horizontal axis, on which f(P) is zero. Note that this time the solution values are just P
values, not ordered pairs. The solution P values are often referred to as the roots of the
quadratic equation f(P) = 0, or, alternatively, as the zeros of the quadratic function f(P).

FIGURE 3.2

† The distinction between quadratic function and quadratic equation just discussed can be extended
also to cases of polynomials other than quadratic. Thus, a cubic equation results when a cubic
function is set equal to zero.

There are two such intersection points in Fig. 3.2, namely, (1, 0) and (-5, 0). As
required, the second element of each of these ordered pairs (the ordinate of the corresponding
point) shows f(P) = 0 in both cases. The first element of each ordered pair (the
abscissa of the point), on the other hand, gives the solution value of P. Here we get two
solutions,

$$P_1^* = 1 \qquad \text{and} \qquad P_2^* = -5$$

but only the first is economically admissible, as negative prices are ruled out.


The Quadratic Formula 

Equation (3.7) has been solved graphically, but an algebraic method is also available. In
general, given a quadratic equation in the form

$$ax^2 + bx + c = 0 \qquad (a \neq 0) \tag{3.9}$$

there are two roots, which can be obtained from the quadratic formula:

$$x_1^*,\, x_2^* = \frac{-b \pm (b^2 - 4ac)^{1/2}}{2a} \tag{3.10}$$

where the + part of the ± sign yields $x_1^*$ and the - part yields $x_2^*$.

Also note that as long as $b^2 - 4ac > 0$, the values of $x_1^*$ and $x_2^*$ would differ, giving us
two distinct real numbers as the roots. But in the special case where $b^2 - 4ac = 0$, we
would find that $x_1^* = x_2^* = -b/2a$. In this case, the two roots share the identical value; they
are referred to as repeated roots. In yet another special case where $b^2 - 4ac < 0$, we would
have the task of taking the square root of a negative number, which is not possible in the
real-number system. In this latter case, no real-valued roots exist. We shall discuss this
matter further in Sec. 16.1.
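
The three cases for $b^2 - 4ac$ can be made concrete in code. The following is a minimal sketch of (3.10) restricted to real roots; the function name `quadratic_roots` and the convention of returning a tuple with two, one, or zero entries are assumptions made for illustration.

```python
import math


def quadratic_roots(a: float, b: float, c: float) -> tuple:
    """Real roots of ax^2 + bx + c = 0 by the quadratic formula (3.10).

    Returns two roots, one repeated root, or an empty tuple when no
    real-valued roots exist. Hypothetical helper for illustration.
    """
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    discriminant = b**2 - 4 * a * c
    if discriminant < 0:
        return ()                                  # no real-valued roots
    root = math.sqrt(discriminant)
    x1 = (-b + root) / (2 * a)                     # the + part of the sign
    x2 = (-b - root) / (2 * a)                     # the - part of the sign
    return (x1,) if discriminant == 0 else (x1, x2)


print(quadratic_roots(1, 4, -5))   # coefficients of equation (3.7)
print(quadratic_roots(1, 2, 1))    # discriminant zero: repeated root
print(quadratic_roots(1, 0, 1))    # discriminant negative: no real roots
```

The second and third calls use coefficients invented for the example, chosen only so that the discriminant is zero and negative, respectively.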

This widely used formula is derived by means of a process known as "completing the
square." First, dividing each term of (3.9) by a results in the equation

$$x^2 + \frac{b}{a}x + \frac{c}{a} = 0$$

Subtracting c/a from, and adding $b^2/4a^2$ to, both sides of the equation, we get

$$x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} = \frac{b^2}{4a^2} - \frac{c}{a}$$

The left side is now a "perfect square," and thus the equation can be expressed as

$$\left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}$$

or, after taking the square root on both sides,

$$x + \frac{b}{2a} = \pm\frac{(b^2 - 4ac)^{1/2}}{2a}$$

Finally, by subtracting b/2a from both sides, the result in (3.10) is obtained.

Applying the formula to (3.7), where a = 1, b = 4, c = -5, and x = P, the roots are
found to be

$$P_1^*,\, P_2^* = \frac{-4 \pm (16 + 20)^{1/2}}{2} = \frac{-4 \pm 6}{2} = 1,\ -5$$

which check with the graphical solutions in Fig. 3.2. Again, we reject $P_2^* = -5$ on
economic grounds and, after omitting the subscript 1, write simply $P^* = 1$.

With this information in hand, the equilibrium quantity $Q^*$ can readily be found from
either the second or the third equation of (3.6) to be $Q^* = 3$.

Another Graphical Solution 

One method of graphical solution of the present model has been presented in Fig. 3.2.
However, since the quantity variable has been eliminated in deriving the quadratic
equation, only $P^*$ can be found from that figure. If we are interested in finding $P^*$ and $Q^*$
simultaneously from a graph, we must instead use a diagram with Q on one axis and P on
the other, similar in construction to Fig. 3.1. This is illustrated in Fig. 3.3. Our problem is
of course again to find the intersection of two sets of points, namely,

$$D = \{(P, Q) \mid Q = 4 - P^2\}$$
$$\text{and} \qquad S = \{(P, Q) \mid Q = 4P - 1\}$$

If no restriction is placed on the domain and the range, the intersection set will contain two
elements, namely,

$$D \cap S = \{(1, 3),\ (-5, -21)\}$$

The former is located in quadrant I, and the latter (not drawn) in quadrant III. If the domain
and range are restricted to being nonnegative, however, only the first ordered pair (1, 3) can
be accepted. Then the equilibrium is again unique.
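
The intersection set and the effect of the nonnegativity restriction can be reproduced numerically. This is a sketch under the model (3.6); the names `demand`, `supply`, `intersection`, and `admissible` are illustrative.

```python
import math


def demand(P: float) -> float:
    return 4 - P**2        # Q_d = 4 - P^2


def supply(P: float) -> float:
    return 4 * P - 1       # Q_s = 4P - 1


# Roots of P^2 + 4P - 5 = 0, equation (3.7), by the quadratic formula.
a, b, c = 1, 4, -5
root = math.sqrt(b**2 - 4 * a * c)
prices = [(-b + root) / (2 * a), (-b - root) / (2 * a)]

# Each root, paired with its quantity, is a point lying on both curves.
intersection = [(P, demand(P)) for P in prices
                if math.isclose(demand(P), supply(P))]

# Restrict the domain and range to nonnegative values.
admissible = [(P, Q) for P, Q in intersection if P >= 0 and Q >= 0]

print(intersection)
print(admissible)
```

Computing Q from the demand function and checking it against the supply function confirms that each point belongs to both D and S.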

Higher-Degree Polynomial Equations 

If a system of simultaneous equations reduces not to a linear equation such as (3.3)† or to
a quadratic equation such as (3.7) but to a cubic (third-degree polynomial) equation or
quartic (fourth-degree polynomial) equation, the roots will be more difficult to find. One
useful method which may work is that of factoring the function.


Example 1


The expression $x^3 - x^2 - 4x + 4$ can be written as the product of three factors (x - 1),
(x + 2), and (x - 2). Thus the cubic equation

$$x^3 - x^2 - 4x + 4 = 0$$

can be written after factoring as

$$(x - 1)(x + 2)(x - 2) = 0$$

In order for the left-hand product to be zero, at least one of the three terms in the product
must be zero. Setting each term equal to zero in turn, we get

$$x - 1 = 0 \qquad \text{or} \qquad x + 2 = 0 \qquad \text{or} \qquad x - 2 = 0$$

These three equations will supply the three roots of the cubic equation, namely,

$$x_1^* = 1 \qquad x_2^* = -2 \qquad \text{and} \qquad x_3^* = 2$$

† Equation (3.3) can be viewed as the result of setting the linear function (b + d)P - (a + c) equal to
zero.

Example 1 illustrates two interesting and useful facts about factoring. First, given a
third-degree polynomial equation, factoring results in three terms of the form (x - root),
thus yielding three roots. Generally, an nth-degree polynomial equation should yield a total
of n roots. Second, and more important for the purpose of root search, we note the following
relationship between the three roots (1, -2, 2) and the constant term 4: Since the constant
term must be, apart from sign, the product of the three roots, each root must be a divisor of
the constant term. This relationship can be formalized in the following theorem:

Theorem I Given the polynomial equation

$$x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$

where all the coefficients are integers, and the coefficient of $x^n$ is unity, if there exist
integer roots, then each of them must be a divisor of $a_0$.

Sometimes, however, we encounter fractional coefficients in the polynomial equation,
as in

$$x^4 + \tfrac{5}{2}x^3 - \tfrac{11}{2}x^2 - 10x + 6 = 0$$

which does not fall under the provision of Theorem I. Even if we multiply through by 2 to
get rid of the fractions (ending in the form shown in Example 2 which follows), we still
cannot apply Theorem I, because the coefficient of the highest-degree term is not unity. In
such cases, we can resort to a more general theorem:

Theorem II Given the polynomial equation with integer coefficients

$$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$

if there exists a rational root r/s, where r and s are integers without a common divisor
except unity, then r is a divisor of $a_0$, and s is a divisor of $a_n$.


Example 2

Does the quartic equation

$$2x^4 + 5x^3 - 11x^2 - 20x + 12 = 0$$

have rational roots? With $a_0 = 12$, the only possible values for the numerator r in r/s are the
set of divisors {1, -1, 2, -2, 3, -3, 4, -4, 6, -6, 12, -12}. And, with $a_n = 2$, the only possible
values for s are the set of divisors {1, -1, 2, -2}. Taking each element in the r set in turn,
and dividing it by each element in the s set, respectively, we find that r/s can only assume
the values

$$1,\ -1,\ \tfrac{1}{2},\ -\tfrac{1}{2},\ 2,\ -2,\ 3,\ -3,\ \tfrac{3}{2},\ -\tfrac{3}{2},\ 4,\ -4,\ 6,\ -6,\ 12,\ -12$$

Among these candidates for roots, many fail to satisfy the given equation. Letting x = 1 in
the quartic equation, for instance, we get the ridiculous result -12 = 0. In fact, since we are
solving a quartic equation, we can expect at most four of the listed r/s values to qualify as
roots. The four successful candidates turn out to be $\tfrac{1}{2}$, 2, -2, and -3. According to the
factoring principle, we can thus write the given quartic equation equivalently as

$$\left(x - \tfrac{1}{2}\right)(x - 2)(x + 2)(x + 3) = 0$$

where the first factor can also be written as (2x - 1) instead.
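
The candidate search of Example 2 can be mechanized. The following is only a minimal sketch, assuming integer coefficients listed from the highest-degree term down to the constant term, with both end coefficients nonzero; the function names (`divisors`, `evaluate`, `rational_roots`) are our own illustrative choices and do not come from the text.

```python
from fractions import Fraction


def divisors(n):
    """All positive and negative integer divisors of a nonzero integer n."""
    n = abs(n)
    positive = [d for d in range(1, n + 1) if n % d == 0]
    return positive + [-d for d in positive]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule; coeffs run from a_n down to a_0."""
    total = Fraction(0)
    for c in coeffs:
        total = total * x + c
    return total


def rational_roots(coeffs):
    """Rational roots of an integer-coefficient polynomial with a_n != 0 and a_0 != 0."""
    a_n, a_0 = coeffs[0], coeffs[-1]
    # Theorem II: any rational root r/s has r dividing a_0 and s dividing a_n.
    candidates = {Fraction(r, s) for r in divisors(a_0) for s in divisors(a_n)}
    return sorted(x for x in candidates if evaluate(coeffs, x) == 0)


quartic = [2, 5, -11, -20, 12]          # 2x^4 + 5x^3 - 11x^2 - 20x + 12
roots = rational_roots(quartic)
print([str(r) for r in roots])          # ['-3', '-2', '1/2', '2']
print(evaluate(quartic, 1))             # -12, so x = 1 is not a root
```

Exact fractions are used instead of decimals so that the test "value equals zero" is not blurred by rounding.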
In Example 2, we rejected the root candidate 1 because x = 1 fails to satisfy the given
equation; i.e., substitution of x = 1 into the equation does not produce the identity 0 = 0
as required. Now consider the case where x = 1 indeed is a root of some polynomial equation.
In that case, since $x^n = x^{n-1} = \cdots = x = 1$, the polynomial equation would reduce
to the simple form $a_n + a_{n-1} + \cdots + a_1 + a_0 = 0$. This fact provides the rationale for the
following theorem:

Theorem III Given the polynomial equation

$$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$

if the coefficients $a_n, a_{n-1}, \ldots, a_0$ add up to zero, then x = 1 is a root of the equation.
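
Theorem III amounts to a one-line check. As a small sketch (the helper name `one_is_root` is ours), we apply it to the quartic of Example 2 and to the cubic obtained by expanding (x - 1)(x + 2)(x - 2), whose roots 1, -2, and 2 were mentioned earlier:

```python
def one_is_root(coeffs):
    """True when the coefficients sum to zero, i.e., when x = 1 satisfies the equation."""
    return sum(coeffs) == 0


cubic = [1, -1, -4, 4]            # x^3 - x^2 - 4x + 4 = (x - 1)(x + 2)(x - 2)
quartic = [2, 5, -11, -20, 12]    # the quartic of Example 2

print(one_is_root(cubic))         # True
print(one_is_root(quartic))       # False (the coefficients sum to -12)
```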


EXERCISE 3.3 

1. Find the zeros of the following functions graphically: 

(a) $f(x) = x^2 - 8x + 15$        (b) $g(x) = 2x^2 - 4x - 16$

2. Solve Prob. 1 by the quadratic formula. 

3. (o) Find a cubic equation with roots 6, -1, and 3. 

(b) Find a quartic equation with roots 1, 2, 3, and 5. 

4. For each of the following polynomial equations, determine if x = 1 is a root.

(o)x 3 -2x 2 - 3x-2 = 0 (c) 3* 4 -a 2 + 2*-4 = Q 

(b) 2x l - \x 2 + x -2 = 0 

5. Find the rational roots, if any, of the following: 

(a) x 2 - 4x 2 - x+ 6 =0 (c) x 3 -t- |a 2 - §x - j = 0 

(b) 8x 3 + 6x 2 - 3* - 1 = 0 (rf) x 4 - 6x l + 7\x 2 -\x-2 = 0 

6. Find the equilibrium solution for each of the following models:

(a) $Q_d = Q_s$                  (b) $Q_d = Q_s$

    $Q_d = 3 - P^2$                  $Q_d = 8 - P^2$

    $Q_s = 6P - 4$                   $Q_s = P^2 - 2$

7. The market equilibrium condition, $Q_d = Q_s$, is often expressed in an equivalent alternative
form, $Q_d - Q_s = 0$, which has the economic interpretation "excess demand is
zero." Does (3.7) represent this latter version of the equilibrium condition? If not, supply
an appropriate economic interpretation for (3.7).


3.4 General Market Equilibrium

The last two sections dealt with models of an isolated market, wherein the $Q_d$ and $Q_s$ of a
commodity are functions of the price of that commodity alone. In the actual world, though,
no commodity ever enjoys (or suffers) such a hermitic existence; for every commodity,
there would normally exist many substitutes and complementary goods. Thus a more realistic
depiction of the demand function of a commodity should take into account the effect
not only of the price of the commodity itself but also of the prices of related commodities.
The same also holds true for the supply function. Once the prices of other commodities are
brought into the picture, however, the structure of the model itself must be broadened so as to
be able to yield the equilibrium values of these other prices as well. As a result, the price and
quantity variables of multiple commodities must enter endogenously into the model en masse.

In an isolated-market model, the equilibrium condition consists of only one equation,
$Q_d = Q_s$, or $E \equiv Q_d - Q_s = 0$, where E stands for excess demand. When several
interdependent commodities are simultaneously considered, equilibrium would require the
absence of excess demand for each and every commodity included in the model, for if so
much as one commodity is faced with an excess demand, the price adjustment of that commodity
will necessarily affect the quantities demanded and quantities supplied of the
related commodities, thereby causing price changes all around. Consequently, the equilibrium
condition of an n-commodity market model will involve n equations, one for each
commodity, in the form

$$E_i \equiv Q_{di} - Q_{si} = 0 \qquad (i = 1, 2, \ldots, n) \tag{3.11}$$

If a solution exists, there will be a set of prices $P_i^*$ and corresponding quantities $Q_i^*$ such
that all the n equations in the equilibrium condition will be simultaneously satisfied.


Two-Commodity Market Model 

To illustrate the problem, let us discuss a simple model in which only two commodities are 
related to each other. For simplicity, the demand and supply functions of both commodities 
are assumed to be linear. In parametric terms, such a model can be written as

$$
\begin{aligned}
Q_{d1} - Q_{s1} &= 0 \\
Q_{d1} &= a_0 + a_1 P_1 + a_2 P_2 \\
Q_{s1} &= b_0 + b_1 P_1 + b_2 P_2 \\
Q_{d2} - Q_{s2} &= 0 \\
Q_{d2} &= \alpha_0 + \alpha_1 P_1 + \alpha_2 P_2 \\
Q_{s2} &= \beta_0 + \beta_1 P_1 + \beta_2 P_2
\end{aligned}
\tag{3.12}
$$

where the a and b coefficients pertain to the demand and supply functions of the first commodity,
and the $\alpha$ and $\beta$ coefficients are assigned to those of the second. We have not bothered
to specify the signs of the coefficients, but in the course of analysis certain restrictions
will emerge as a prerequisite to economically sensible results. Also, in a subsequent numerical
example, some comments will be made on the specific signs to be given the coefficients.

As a first step toward the solution of this model, we can again resort to elimination of 
variables. By substituting the second and third equations into the first (for the first com¬ 
modity) and the fifth and sixth equations into the fourth (for the second commodity), the 
model is reduced to two equations in two variables:

$$
\begin{aligned}
(a_0 - b_0) + (a_1 - b_1)P_1 + (a_2 - b_2)P_2 &= 0 \\
(\alpha_0 - \beta_0) + (\alpha_1 - \beta_1)P_1 + (\alpha_2 - \beta_2)P_2 &= 0
\end{aligned}
\tag{3.13}
$$

These represent the two-commodity version of (3.11), after the demand and supply func¬ 
tions have been substituted into the two equilibrium conditions. 

Although this is a simple system of only two equations, as many as 12 parameters are
involved, and algebraic manipulations will prove unwieldy unless some sort of shorthand
is introduced. Let us therefore define the shorthand symbols

$$c_i \equiv a_i - b_i \qquad \gamma_i \equiv \alpha_i - \beta_i \qquad (i = 0, 1, 2)$$

Then, after transposing the $c_0$ and $\gamma_0$ terms to the right-hand side, we get

$$
\begin{aligned}
c_1 P_1 + c_2 P_2 &= -c_0 \\
\gamma_1 P_1 + \gamma_2 P_2 &= -\gamma_0
\end{aligned}
\tag{3.13'}
$$

which may be solved by further elimination of variables. From the first equation, it can be
found that $P_2 = -(c_0 + c_1 P_1)/c_2$. Substituting this into the second equation and solving,
we get

$$P_1^* = \frac{c_2\gamma_0 - c_0\gamma_2}{c_1\gamma_2 - c_2\gamma_1} \tag{3.14}$$


Note that $P_1^*$ is entirely expressed, as a solution value should be, in terms of the data
(parameters) of the model. By a similar process, the equilibrium price of the second commodity
is found to be

$$P_2^* = \frac{c_0\gamma_1 - c_1\gamma_0}{c_1\gamma_2 - c_2\gamma_1} \tag{3.15}$$

For these two values to make sense, however, certain restrictions should be imposed on the
model. First, since division by zero is undefined, we must require the common denominator
of (3.14) and (3.15) to be nonzero, that is, $c_1\gamma_2 \neq c_2\gamma_1$. Second, to assure positivity, the
numerator must have the same sign as the denominator.

The equilibrium prices having been found, the equilibrium quantities $Q_1^*$ and $Q_2^*$ can
readily be calculated by substituting (3.14) and (3.15) into the second (or third) equation
and the fifth (or sixth) equation of (3.12). These solution values will naturally also be expressed
in terms of the parameters. (Their actual calculation is left to you as an exercise.)


Numerical Example 

Suppose that the demand and supply functions are numerically as follows: 

$$
\begin{aligned}
Q_{d1} &= 10 - 2P_1 + P_2 \\
Q_{s1} &= -2 + 3P_1 \\
Q_{d2} &= 15 + P_1 - P_2 \\
Q_{s2} &= -1 + 2P_2
\end{aligned}
\tag{3.16}
$$


What is the equilibrium solution? 

Before answering the question, let us take a look at the numerical coefficients. For each
commodity, $Q_{si}$ is seen to depend on $P_i$ alone, but $Q_{di}$ is shown as a function of both
prices. Note that while $P_1$ has a negative coefficient in $Q_{d1}$, as we would expect, the coefficient
of $P_2$ is positive. The fact that a rise in $P_2$ tends to raise $Q_{d1}$ suggests that the two
commodities are substitutes for each other. The role of $P_1$ in the $Q_{d2}$ function has a similar
interpretation.

With these coefficients, the shorthand symbols $c_i$ and $\gamma_i$ will take the following values:

$$c_0 = 10 - (-2) = 12 \qquad c_1 = -2 - 3 = -5 \qquad c_2 = 1 - 0 = 1$$

$$\gamma_0 = 15 - (-1) = 16 \qquad \gamma_1 = 1 - 0 = 1 \qquad \gamma_2 = -1 - 2 = -3$$

By direct substitution of these into (3.14) and (3.15), we obtain

$$P_1^* = \frac{52}{14} = 3\tfrac{5}{7} \qquad \text{and} \qquad P_2^* = \frac{92}{14} = 6\tfrac{4}{7}$$

And the further substitution of $P_1^*$ and $P_2^*$ into (3.16) yields

$$Q_1^* = \frac{64}{7} = 9\tfrac{1}{7} \qquad \text{and} \qquad Q_2^* = \frac{85}{7} = 12\tfrac{1}{7}$$

Thus all the equilibrium values turn out positive, as required. In order to preserve the exact
values of $P_1^*$ and $P_2^*$ to be used in the further calculation of $Q_1^*$ and $Q_2^*$, it is advisable to
express them as fractions rather than decimals.
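
The advice to work in fractions is easy to follow on a computer as well. The sketch below is merely one way to reproduce the numbers of this example under the formulas (3.14) and (3.15); the function name `two_commodity_prices` and the variable names `g0`, `g1`, `g2` (standing for $\gamma_0$, $\gamma_1$, $\gamma_2$) are our own labels.

```python
from fractions import Fraction


def two_commodity_prices(c0, c1, c2, g0, g1, g2):
    """Equilibrium prices from (3.14) and (3.15); g0, g1, g2 stand for the gammas."""
    denom = c1 * g2 - c2 * g1
    if denom == 0:
        raise ValueError("c1*gamma2 must differ from c2*gamma1")
    p1 = Fraction(c2 * g0 - c0 * g2, denom)
    p2 = Fraction(c0 * g1 - c1 * g0, denom)
    return p1, p2


# Shorthand coefficients computed from the numerical model (3.16)
c0, c1, c2 = 10 - (-2), -2 - 3, 1 - 0
g0, g1, g2 = 15 - (-1), 1 - 0, -1 - 2

p1, p2 = two_commodity_prices(c0, c1, c2, g0, g1, g2)
q1 = -2 + 3 * p1          # supply function of commodity 1
q2 = -1 + 2 * p2          # supply function of commodity 2
print(p1, p2, q1, q2)     # 26/7 46/7 64/7 85/7
```

Substituting the same prices into the two demand functions gives the same quantities, which serves as a check on the arithmetic.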

Could we have obtained the equilibrium prices graphically? The answer is yes. From
(3.13), it is clear that a two-commodity model can be summarized by two equations in two
variables $P_1$ and $P_2$. With known numerical coefficients, both equations can be plotted in the
$P_1P_2$ coordinate plane, and the intersection of the two curves will then pinpoint $P_1^*$ and $P_2^*$.


n-Commodity Case 

The previous discussion of the multicommodity market has been limited to the case of
two commodities, but it should be apparent that we are already moving from partial-equilibrium
analysis in the direction of general-equilibrium analysis. As more commodities
enter into a model, there will be more variables and more equations, and the equations will 
get longer and more complicated. If all the commodities in an economy are included in a 
comprehensive market model, the result will be a Walrasian type of general-equilibrium 
model, in which the excess demand for every commodity is considered to be a function of 
the prices of all the commodities in the economy. 

Some of the prices may, of course, carry zero coefficients when they play no role in the 
determination of the excess demand of a particular commodity; e.g., in the excess-demand 
function of pianos the price of popcorn may well have a zero coefficient. In general, however,
with n commodities in all, we may express the demand and supply functions as
follows (using $Q_{di}$ and $Q_{si}$ as function symbols in place of f and g):

$$
\begin{aligned}
Q_{di} &= Q_{di}(P_1, P_2, \ldots, P_n) \\
Q_{si} &= Q_{si}(P_1, P_2, \ldots, P_n)
\end{aligned}
\qquad (i = 1, 2, \ldots, n) \tag{3.17}
$$

In view of the index subscript, these two equations represent the totality of the 2n functions
which the model contains. (These functions are not necessarily linear.) Moreover, the equilibrium
condition is itself composed of a set of n equations,

$$Q_{di} - Q_{si} = 0 \qquad (i = 1, 2, \ldots, n) \tag{3.18}$$

When (3.18) is added to (3.17), the model becomes complete. You should therefore count a
total of 3n equations.

Upon substitution of (3.17) into (3.18), however, the model can be reduced to a set of n
simultaneous equations only:

$$Q_{di}(P_1, P_2, \ldots, P_n) - Q_{si}(P_1, P_2, \ldots, P_n) = 0 \qquad (i = 1, 2, \ldots, n)$$

Besides, inasmuch as $E_i \equiv Q_{di} - Q_{si}$, where $E_i$ is necessarily also a function of all the n
prices, the latter set of equations may be written alternatively as

$$E_i(P_1, P_2, \ldots, P_n) = 0 \qquad (i = 1, 2, \ldots, n)$$

Solved simultaneously, these n equations can determine the n equilibrium prices $P_i^*$, if
a solution does indeed exist. And then the $Q_i^*$ may be derived from the demand or supply
functions.

Solution of a General-Equation System 

If a model comes equipped with numerical coefficients, as in (3.16), the equilibrium values of
the variables will be in numerical terms, too. On a more general level, if a model is expressed
in terms of parametric constants, as in (3.12), the equilibrium values will also involve parameters
and will hence appear as "formulas," as exemplified by (3.14) and (3.15). If, for greater
generality, even the function forms are left unspecified in a model, however, as in (3.17), the
manner of expressing the solution values will of necessity be exceedingly general as well.

Drawing upon our experience in parametric models, we know that a solution value is always
an expression in terms of the parameters. For a general-function model containing,
say, a total of m parameters $(a_1, a_2, \ldots, a_m)$, where m is not necessarily equal to n, the n
equilibrium prices can be expected to take the general analytical form of

$$P_i^* = P_i^*(a_1, a_2, \ldots, a_m) \qquad (i = 1, 2, \ldots, n) \tag{3.19}$$

This is a symbolic statement to the effect that the solution value of each variable (here, 
price) is a function of the set of all parameters of the model. As this is a very general state¬ 
ment, it really does not give much detailed information about the solution. But in the gen¬ 
eral analytical treatment of some types of problem, even this seemingly uninformative way 
of expressing a solution will prove of use, as will be seen in Chap. 8. 

Writing such a solution is an easy task. But an important catch exists: the expression in
(3.19) can be justified if and only if a unique solution does indeed exist, for then and only
then can we map the ordered m-tuple $(a_1, a_2, \ldots, a_m)$ into a determinate value for each
price $P_i^*$. Yet, unfortunately for us, there is no a priori reason to presume that every model
will automatically yield a unique solution. In this connection, it needs to be emphasized
that the process of "counting equations and unknowns" does not suffice as a test. Some
very simple examples should convince us that an equal number of equations and unknowns
(endogenous variables) does not necessarily guarantee the existence of a unique solution.

Consider the three simultaneous-equation systems

$$
\begin{aligned}
x + y &= 8 \\
x + y &= 9
\end{aligned}
\tag{3.20}
$$

$$
\begin{aligned}
2x + y &= 12 \\
4x + 2y &= 24
\end{aligned}
\tag{3.21}
$$

$$
\begin{aligned}
2x + 3y &= 58 \\
y &= 18 \\
x + y &= 20
\end{aligned}
\tag{3.22}
$$

In (3.20), despite the fact that two unknowns are linked together by exactly two equations,
there is nevertheless no solution. These two equations happen to be inconsistent, for if the
sum of x and y is 8, it cannot possibly be 9 at the same time. In (3.21), another case of two
equations in two variables, the two equations are functionally dependent, which means that
one can be derived from (and is implied by) the other. (Here, the second equation is equal
to two times the first equation.) Consequently, one equation is redundant and may be dropped
from the system, leaving in effect only one equation in two unknowns. The solution will then
be the equation y = 12 - 2x, which yields not a unique ordered pair $(x^*, y^*)$ but an infinite
number of them, including (0, 12), (1, 10), (2, 8), etc., all of which satisfy that equation.
Lastly, the case of (3.22) involves more equations than unknowns, yet the ordered pair (2, 18)
does constitute the unique solution to it. The reason is that, in view of the existence of functional
dependence among the equations (the first is equal to the second plus twice the third),
we have in effect only two independent, consistent equations in two variables.
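
The three cases can also be told apart mechanically. The sketch below does so for linear systems by comparing the number of independent rows in the coefficient array with that in the array augmented by the constants. This rank comparison is a device we borrow here purely for illustration; it has not been introduced in the text at this point, and the function names `rank` and `classify` are our own.

```python
from fractions import Fraction


def rank(rows):
    """Number of independent rows, found by Gaussian elimination in exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def classify(coeffs, consts):
    """Classify a linear system as inconsistent, dependent, or uniquely solvable."""
    augmented = [row + [d] for row, d in zip(coeffs, consts)]
    rank_a, rank_aug = rank(coeffs), rank(augmented)
    if rank_a < rank_aug:
        return "no solution"                 # inconsistent equations
    if rank_a < len(coeffs[0]):
        return "infinitely many solutions"   # too few independent equations
    return "unique solution"


print(classify([[1, 1], [1, 1]], [8, 9]))                  # (3.20): no solution
print(classify([[2, 1], [4, 2]], [12, 24]))                # (3.21): infinitely many solutions
print(classify([[2, 3], [0, 1], [1, 1]], [58, 18, 20]))    # (3.22): unique solution
```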

These simple examples should suffice to convey the importance of consistency and functional
independence as the two prerequisites for application of the process of counting
equations and unknowns. In general, in order to apply that process, make sure that (1) the
satisfaction of any one equation in the model will not preclude the satisfaction of another
and (2) no equation is redundant. In (3.17), for example, the n demand and n supply functions
may safely be assumed to be independent of one another, each being derived from a
different source: each demand from the decisions of a group of consumers, and each supply
from the decisions of a group of firms. Thus each function serves to describe one facet
of the market situation, and none is redundant. Mutual consistency may perhaps also be
assumed. In addition, the equilibrium-condition equations in (3.18) are also independent
and presumably consistent. Therefore the analytical solution as written in (3.19) can in
general be considered justifiable.*

For simultaneous-equation models, there exist systematic methods of testing the existence
of a unique (or determinate) solution. These would involve, for linear models, an
application of the concept of determinants, to be introduced in Chap. 5. In the case of nonlinear
models, such a test would also require a knowledge of so-called partial derivatives
and a special type of determinant called the Jacobian determinant, which will be discussed
in Chaps. 7 and 8.


EXERCISE 3.4 

1. Work out the step-by-step solution of (3.13'); thereby verifying the results in (3.14) 
and (3.15). 

2. Rewrite (3.14) and (3.15) in terms of the original parameters of the model in (3.12).

3. The demand and supply functions of a two-commodity market model are as follows:

$Q_{d1} = 18 - 3P_1 + P_2$        $Q_{d2} = 12 + P_1 - 2P_2$

$Q_{s1} = -2 + 4P_1$              $Q_{s2} = -2 + 3P_2$

Find $P_i^*$ and $Q_i^*$ (i = 1, 2). (Use fractions rather than decimals.)


* This is essentially the way that Leon Walras approached the problem of the existence of
a general-market equilibrium. In the modern literature, there can be found a number of
sophisticated mathematical proofs of the existence of a competitive market equilibrium under
certain postulated economic conditions. But the mathematics used is advanced. The easiest one
to understand is perhaps the proof given in Robert Dorfman, Paul A. Samuelson, and Robert M.
Solow, Linear Programming and Economic Analysis, McGraw-Hill Book Company, New York, 1958,
Chap. 13.
3.5 Equilibrium in National-Income Analysis


Even though the discussion of static analysis has hitherto been restricted to market models
in various guises (linear and nonlinear, one-commodity and multicommodity, specific and
general), it, of course, has applications in other areas of economics also. As an example,
we may cite the simplest Keynesian national-income model,

$$
\begin{aligned}
Y &= C + I_0 + G_0 \\
C &= a + bY \qquad (a > 0,\quad 0 < b < 1)
\end{aligned}
\tag{3.23}
$$

where Y and C stand for the endogenous variables national income and (planned) consumption
expenditure, respectively, and $I_0$ and $G_0$ represent the exogenously determined
investment and government expenditures. The first equation is an equilibrium condition
(national income = total planned expenditure). The second, the consumption function, is
behavioral. The two parameters in the consumption function, a and b, stand for the autonomous
consumption expenditure and the marginal propensity to consume, respectively.

It is quite clear that these two equations in two endogenous variables are neither functionally
dependent upon, nor inconsistent with, each other. Thus we would be able to find
the equilibrium values of income and consumption expenditure, $Y^*$ and $C^*$, in terms of the
parameters a and b and the exogenous variables $I_0$ and $G_0$.

Substitution of the second equation into the first will reduce (3.23) to a single equation
in one variable, Y:

$$Y = a + bY + I_0 + G_0$$

or

$$(1 - b)Y = a + I_0 + G_0 \qquad \text{(collecting terms involving } Y\text{)}$$

To find the solution value of Y (equilibrium national income), we only have to divide
through by (1 - b):

$$Y^* = \frac{a + I_0 + G_0}{1 - b} \tag{3.24}$$

Note, again, that the solution value is expressed entirely in terms of the parameters and exogenous
variables, the given data of the model. Putting (3.24) into the second equation of
(3.23) will then yield the equilibrium level of consumption expenditure:

$$C^* = a + bY^* = a + \frac{b(a + I_0 + G_0)}{1 - b} = \frac{a(1 - b) + b(a + I_0 + G_0)}{1 - b} = \frac{a + b(I_0 + G_0)}{1 - b} \tag{3.25}$$

This is again expressed entirely in terms of the given data.

Both $Y^*$ and $C^*$ have the expression (1 - b) in the denominator; thus a restriction $b \neq 1$
is necessary, to avoid division by zero. Since b, the marginal propensity to consume, has been
assumed to be a positive fraction, this restriction is automatically satisfied. For $Y^*$ and $C^*$ to
be positive, moreover, the numerators in (3.24) and (3.25) must be positive. Since the exogenous
expenditures $I_0$ and $G_0$ are normally positive, as is the parameter a (the vertical intercept
of the consumption function), the sign of the numerator expressions will work out, too.
As a check on our calculation, we can add the $C^*$ expression in (3.25) to $(I_0 + G_0)$ and
verify that the sum is equal to the $Y^*$ expression in (3.24).
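
The same check can be carried out numerically. The following sketch evaluates (3.24) and (3.25) for one set of purely hypothetical values (a = 25, b = 3/4, $I_0$ = 16, $G_0$ = 14, chosen by us only for illustration) and confirms that both equations of (3.23) hold; the function name `keynesian_equilibrium` is likewise our own.

```python
from fractions import Fraction


def keynesian_equilibrium(a, b, i0, g0):
    """Return (Y*, C*) from (3.24) and (3.25); requires 0 < b < 1."""
    if not 0 < b < 1:
        raise ValueError("the marginal propensity to consume must satisfy 0 < b < 1")
    y_star = (a + i0 + g0) / (1 - b)
    c_star = (a + b * (i0 + g0)) / (1 - b)
    return y_star, c_star


# Hypothetical data, used only to illustrate the formulas
a, b = Fraction(25), Fraction(3, 4)
i0, g0 = Fraction(16), Fraction(14)

y_star, c_star = keynesian_equilibrium(a, b, i0, g0)
print(y_star, c_star)                       # 220 190
print(c_star + i0 + g0 == y_star)           # True: equilibrium condition holds
print(a + b * y_star == c_star)             # True: consumption function holds
```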

This model is obviously one of extreme simplicity and crudity, but other models of 
national-income determination, in varying degrees of complexity and sophistication, can 
be constructed as well. In each case, however, the principles involved in the construction 
and analysis of the model are identical with those already discussed. For this reason, we 
shall not go into further illustrations here. A more comprehensive national-income model, 
involving the simultaneous equilibrium of the money market and the goods market, will be 
discussed in Sec. 8.6. 


EXERCISE 3.5 

1. Given the following model:

$Y = C + I_0 + G_0$

$C = a + b(Y - T)$    (a > 0, 0 < b < 1)    [T: taxes]

$T = d + tY$    (d > 0, 0 < t < 1)    [t: income tax rate]

(a) How many endogenous variables are there?

(b) Find $Y^*$, $T^*$, and $C^*$.

2. Let the national-income model be:

$Y = C + I_0 + G$

$C = a + b(Y - T_0)$    (a > 0, 0 < b < 1)

$G = gY$    (0 < g < 1)

(a) Identify the endogenous variables.

(b) Give the economic meaning of the parameter g.

(c) Find the equilibrium national income.

(d) What restriction on the parameters is needed for a solution to exist?

3. Find $Y^*$ and $C^*$ from the following:

$Y = C + I_0 + G_0$
$C = 25 + 6Y^{1/2}$
$I_0 = 16$
$G_0 = 14$



Chapter 


Linear Models and 
Matrix Algebra 

For the one-commodity model (3.1), the solutions $P^*$ and $Q^*$ as expressed in (3.4)
and (3.5), respectively, are relatively simple, even though a number of parameters are
involved. As more and more commodities are incorporated into the model, such solution
formulas quickly become cumbersome and unwieldy. That was why we had to resort to a
little shorthand, even for the two-commodity case, in order that the solutions (3.14)
and (3.15) can still be written in a relatively concise fashion. We did not attempt to tackle
any three- or four-commodity models, even in the linear version, primarily because we did
not yet have at our disposal a method suitable for handling a large system of simultaneous
equations. Such a method is found in matrix algebra, the subject of this chapter and the next.

Matrix algebra can enable us to do many things. In the first place, it provides a compact 
way of writing an equation system, even an extremely large one. Second, it leads to a way 
of testing the existence of a solution by evaluation of a determinant —a concept closely 
related to that of a matrix. Third, it gives a method of finding that solution (if it exists). 
Since equation systems are encountered not only in static analysis but also in comparative- 
static and dynamic analyses and in optimization problems, you will find ample application 
of matrix algebra in almost every chapter that is to follow. This is why it is desirable to in¬ 
troduce matrix algebra early. 

However, one slight catch is that matrix algebra is applicable only to linear-equation
systems. How realistically linear equations can describe actual economic relationships de¬ 
pends, of course, on the nature of the relationships in question. In many cases, even if some 
sacrifice of realism is entailed by the assumption of linearity, an assumed linear relation¬ 
ship can produce a sufficiently close approximation to an actual nonlinear relationship to 
warrant its use. 

In other cases, while preserving the nonlinearity in the model, we can effect a transformation
of variables so as to obtain a linear relation to work with. For example, the nonlinear
function

$$y = ax^b$$

can be readily transformed, by taking the logarithm on both sides, into the function

$$\log y = \log a + b \log x$$

which is linear in the two variables (log y) and (log x). (Logarithms will be discussed in
more detail in Chap. 10.) More importantly, in many applications such as comparative-static
analysis and optimization problems, discussed subsequently, although the original
formulation of the economic model is nonlinear in nature, linear equation systems will
emerge in the course of analysis. Thus the linearity restriction is not nearly as restrictive as
it may first appear.
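
The logarithmic transformation mentioned above is easy to confirm numerically. In the sketch below, the parameter values a = 2 and b = 0.5 and the sample points are hypothetical choices of ours, and natural logarithms are used; the points (log x, log y) are seen to lie on a straight line with slope b and intercept log a.

```python
import math


def power_function(a, b, x):
    """The nonlinear function y = a * x**b."""
    return a * x ** b


a, b = 2.0, 0.5                      # hypothetical parameter values
xs = [1.0, 4.0, 9.0]                 # hypothetical sample points
log_points = [(math.log(x), math.log(power_function(a, b, x))) for x in xs]

# Slope and intercept of the line through the first and last transformed points
(u1, v1), (u2, v2) = log_points[0], log_points[-1]
slope = (v2 - v1) / (u2 - u1)
intercept = v1 - slope * u1

print(math.isclose(slope, b))                   # True
print(math.isclose(intercept, math.log(a)))     # True
```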
The two-commodity market model (3.12) can be written, after eliminating the quantity
variables, as a system of two linear equations, as in (3.13'),

$$
\begin{aligned}
c_1 P_1 + c_2 P_2 &= -c_0 \\
\gamma_1 P_1 + \gamma_2 P_2 &= -\gamma_0
\end{aligned}
$$

where the parameters $c_0$ and $\gamma_0$ appear to the right of the equals sign. In general, a system
of m linear equations in n variables $(x_1, x_2, \ldots, x_n)$ can also be arranged into such a
format:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= d_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= d_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= d_m
\end{aligned}
\tag{4.1}
$$

In (4.1), the variable $x_1$ appears only within the leftmost column, and in general the variable
$x_j$ appears only in the jth column on the left side of the equals sign. The double-subscripted
parameter symbol $a_{ij}$ represents the coefficient appearing in the ith equation
and attached to the jth variable. For example, $a_{21}$ is the coefficient in the second equation,
attached to the variable $x_1$. The parameter $d_i$, which is unattached to any variable, on the
other hand, represents the constant term in the ith equation. For instance, $d_1$ is the constant
term in the first equation. All subscripts are therefore keyed to the specific locations of the
variables and parameters in (4.1).

Matrices as Arrays 

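There are essentially three types of ingredients in the equation system (4.1). The first is the
set of coefficients $a_{ij}$; the second is the set of variables $x_1, \ldots, x_n$; and the last is the set of
constant terms $d_1, \ldots, d_m$. If we arrange the three sets as three rectangular arrays and label
them, respectively, as A, x, and d (without subscripts), then we have

$$
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\qquad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\qquad
d = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_m \end{bmatrix}
\tag{4.2}
$$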
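As a simple example, given the linear-equation system

$$
\begin{aligned}
6x_1 + 3x_2 + x_3 &= 22 \\
x_1 + 4x_2 - 2x_3 &= 12 \\
4x_1 - x_2 + 5x_3 &= 10
\end{aligned}
\tag{4.3}
$$

we can write

$$
A = \begin{bmatrix} 6 & 3 & 1 \\ 1 & 4 & -2 \\ 4 & -1 & 5 \end{bmatrix}
\qquad
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
\qquad
d = \begin{bmatrix} 22 \\ 12 \\ 10 \end{bmatrix}
\tag{4.4}
$$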


Each of the three arrays in (4.2) or (4.4) constitutes a matrix. 

A matrix is defined as a rectangular array of numbers, parameters, or variables. The
members of the array, referred to as the elements of the matrix, are usually enclosed in
brackets, as in (4.2), or sometimes in parentheses or with double vertical lines: || ||. Note
that in matrix A (the coefficient matrix of the equation system), the elements are separated
not by commas but by blank spaces only. As a shorthand device, the array in matrix A can
be written more simply as

$$A = [a_{ij}] \qquad \begin{pmatrix} i = 1, 2, \ldots, m \\ j = 1, 2, \ldots, n \end{pmatrix}$$

Inasmuch as the location of each element in a matrix is unequivocally fixed by the subscript,
every matrix is an ordered set.


Vectors as Special Matrices 

The number of rows and the number of columns in a matrix together define the dimension
of the matrix. Since matrix A in (4.2) contains m rows and n columns, it is said to be of
dimension m x n (read "m by n"). It is important to remember that the row number always
precedes the column number; this is in line with the way the two subscripts in $a_{ij}$ are
ordered. In the special case where m = n, the matrix is called a square matrix; thus the
matrix A in (4.4) is a 3 x 3 square matrix.

Some matrices may contain only one column, such as x and d in (4.2) or (4.4). Such
matrices are given the special name column vectors. In (4.2), the dimension of x is n x 1,
and that of d is m x 1; in (4.4) both x and d are 3 x 1. If we arranged the variables $x_j$ in a
horizontal array, though, there would result a 1 x n matrix, which is called a row vector. For
notation purposes, a row vector is often distinguished from a column vector by the use of a
primed symbol:

$$x' = [x_1 \quad x_2 \quad \cdots \quad x_n]$$

You may observe that a vector (whether row or column) is merely an ordered n-tuple, and
as such it may sometimes be interpreted as a point in an n-dimensional space. In turn, the
m x n matrix A can be interpreted as an ordered set of m row vectors or as an ordered set
of n column vectors. These ideas will be followed up in Chap. 5.

An issue of more immediate interest is how the matrix notation can enable us, as
promised, to express an equation system in a compact way. With the matrices defined in
(4.4), we can express the equation system (4.3) simply as

$$Ax = d$$

In fact, if A, x, and d are given the meanings in (4.2), then even the general-equation
system in (4.1) can be written as Ax = d. The compactness of this notation is thus
unmistakable.
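
The array view of (4.3) carries over directly to a computer. In the sketch below, A and d of (4.4) are stored as plain Python lists, and each equation's left side is evaluated term by term as in (4.1). The helper names `dimension` and `left_hand_sides`, and the trial vector (2, 3, 1), are our own choices for illustration; no matrix-multiplication rule is presupposed.

```python
A = [[6, 3, 1],
     [1, 4, -2],
     [4, -1, 5]]          # coefficient matrix of (4.4)
d = [22, 12, 10]          # vector of constant terms of (4.4)


def dimension(matrix):
    """Return (number of rows, number of columns)."""
    return len(matrix), len(matrix[0])


def left_hand_sides(matrix, x):
    """Evaluate a_i1*x_1 + ... + a_in*x_n for each equation i, as laid out in (4.1)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in matrix]


x_trial = [2, 3, 1]                          # a trial value of the vector x
print(dimension(A))                          # (3, 3)
print(left_hand_sides(A, x_trial))           # [22, 12, 10]
print(left_hand_sides(A, x_trial) == d)      # True: the trial vector satisfies (4.3)
```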

However, the equation Ax = d prompts at least two questions. How do we multiply two
matrices A and x? What is meant by the equality of Ax and d? Since matrices involve
whole blocks of numbers, the familiar algebraic operations defined for single numbers are
not directly applicable, and there is a need for a new set of operational rules.


EXERCISE 4.1 

1. Rewrite the market model (3.1) in the format of (4.1), and show that, if the three variables
are arranged in the order $Q_d$, $Q_s$, and P, the coefficient matrix will be

$$\begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & b \\ 0 & 1 & -d \end{bmatrix}$$

How would you write the vector of constants?

2. Rewrite the market model (3.12) in the format of (4.1) with the variables arranged in
the following order: $Q_{d1}$, $Q_{s1}$, $Q_{d2}$, $Q_{s2}$, $P_1$, $P_2$. Write out the coefficient matrix, the
variable vector, and the constant vector.

3. Can the market model (3.6) be rewritten in the format of (4.1)? Why?

4. Rewrite the national-income model (3.23) in the format of (4.1), with Y as the first variable.
Write out the coefficient matrix and the constant vector.

5. Rewrite the national-income model of Exercise 3.5-1 in the format of (4.1), with the
variables in the order Y, T, and C. [Hint: Watch out for the multiplicative expression
b(Y - T) in the consumption function.]


4.2 Matrix Operations 


As a preliminary, let us first define the word equality. Two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$
are said to be equal if and only if they have the same dimension and have identical elements
in the corresponding locations in the array. In other words, A = B if and only if $a_{ij} = b_{ij}$
for all values of i and j. Thus, for example, we find

$$
\begin{bmatrix} 4 & 3 \\ 2 & 0 \end{bmatrix}
= \begin{bmatrix} 4 & 3 \\ 2 & 0 \end{bmatrix}
\neq \begin{bmatrix} 2 & 0 \\ 4 & 3 \end{bmatrix}
$$

As another example, if $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 7 \\ 4 \end{bmatrix}$, this will mean that x = 7 and y = 4.


Addition and Subtraction of Matrices 

Two matrices can be added if and only if they have the same dimension. When this dimensional
requirement is met, the matrices are said to be conformable for addition. In that case,
the addition of $A = [a_{ij}]$ and $B = [b_{ij}]$ is defined as the addition of each pair of corresponding
elements.
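Example 1

$$\begin{bmatrix} 4 & 9 \\ 2 & 1 \end{bmatrix} + \begin{bmatrix} 2 & 0 \\ 0 & 7 \end{bmatrix} = \begin{bmatrix} 4+2 & 9+0 \\ 2+0 & 1+7 \end{bmatrix} = \begin{bmatrix} 6 & 9 \\ 2 & 8 \end{bmatrix}$$

Example 2

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \\ a_{31}+b_{31} & a_{32}+b_{32} \end{bmatrix}$$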


In general, we may state the rule thus: 

$$[a_{ij}] + [b_{ij}] = [c_{ij}] \qquad \text{where } c_{ij} = a_{ij} + b_{ij}$$

Note that the sum matrix $[c_{ij}]$ must have the same dimension as the component matrices
$[a_{ij}]$ and $[b_{ij}]$.


The subtraction operation A - B can be similarly defined if and only if A and B have 
the same dimension. The operation entails the result 

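$$[a_{ij}] - [b_{ij}] = [d_{ij}] \qquad \text{where } d_{ij} = a_{ij} - b_{ij}$$

Example 3

$$\begin{bmatrix} 19 & 3 \\ 2 & 0 \end{bmatrix} - \begin{bmatrix} 6 & 8 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 19-6 & 3-8 \\ 2-1 & 0-3 \end{bmatrix} = \begin{bmatrix} 13 & -5 \\ 1 & -3 \end{bmatrix}$$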
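The addition and subtraction rules can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function name `combine` and the representation of a matrix as a list of rows are assumptions made here. The sketch first checks that the two matrices are conformable for addition and then works element by element.

```python
def combine(A, B, sign=1):
    """Return A + sign*B element by element; A and B are lists of rows.

    sign=1 gives the sum A + B, sign=-1 gives the difference A - B.
    """
    same_shape = len(A) == len(B) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(A, B)
    )
    if not same_shape:
        raise ValueError("matrices are not conformable for addition")
    return [
        [a + sign * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(A, B)
    ]


# The numbers are those of Example 1 (addition) and Example 3 (subtraction).
sum_matrix = combine([[4, 9], [2, 1]], [[2, 0], [0, 7]])
difference_matrix = combine([[19, 3], [2, 0]], [[6, 8], [1, 3]], sign=-1)
print(sum_matrix)
print(difference_matrix)
```

Passing two matrices of different dimensions raises an error, which mirrors the requirement that the sum is defined only for matrices of the same dimension.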


The subtraction operation $A - B$ may be considered alternatively as an addition operation
involving a matrix $A$ and another matrix $(-1)B$. This, however, raises the question of what
is meant by the multiplication of a matrix by a single number (here, $-1$).


Scalar Multiplication 

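To multiply a matrix by a number (or, in matrix-algebra terminology, by a scalar) is to
multiply every element of that matrix by the given scalar.

Example 4

$$7 \begin{bmatrix} 3 & -1 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} 21 & -7 \\ 0 & 35 \end{bmatrix}$$

Example 5

$$\frac{1}{2} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} \frac{1}{2}a_{11} & \frac{1}{2}a_{12} \\ \frac{1}{2}a_{21} & \frac{1}{2}a_{22} \end{bmatrix}$$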


From these examples, the rationale of the name scalar should become clear, for it "scales
up (or down)" the matrix by a certain multiple. The scalar can, of course, be a negative
number as well.

Example 6

$$-1 \begin{bmatrix} a_{11} & a_{12} & d_1 \\ a_{21} & a_{22} & d_2 \end{bmatrix} = \begin{bmatrix} -a_{11} & -a_{12} & -d_1 \\ -a_{21} & -a_{22} & -d_2 \end{bmatrix}$$


Note that if the matrix on the left represents the coefficients and the constant terms in the
simultaneous equations

$$a_{11}x_1 + a_{12}x_2 = d_1$$
$$a_{21}x_1 + a_{22}x_2 = d_2$$

then multiplication by the scalar $-1$ will amount to multiplying both sides of both equations
by $-1$, thereby changing the sign of every term in the system.
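Scalar multiplication is easy to express in code. The following Python sketch is an illustration only; the function name `scalar_multiply` and the numbers in the second call are hypothetical. It multiplies every element by the scalar, once with the numbers of Example 4 and once with the scalar $-1$ applied to a small matrix of coefficients and constants.

```python
def scalar_multiply(k, A):
    """Multiply every element of matrix A (a list of rows) by the scalar k."""
    return [[k * element for element in row] for row in A]


# Example 4: the scalar 7 scales up each element.
scaled = scalar_multiply(7, [[3, -1], [0, 5]])

# Hypothetical coefficients and constants [a11 a12 d1; a21 a22 d2];
# multiplying by -1 changes the sign of every term.
negated = scalar_multiply(-1, [[1, 2, 3], [4, 5, 6]])

print(scaled)
print(negated)
```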
Multiplication of Matrices

Whereas a scalar can be used to multiply a matrix of any dimension, the multiplication of
two matrices is contingent upon the satisfaction of a different dimensional requirement.

Suppose that, given two matrices $A$ and $B$, we want to find the product $AB$. The
conformability condition for multiplication is that the column dimension of $A$ (the "lead"
matrix in the expression $AB$) must be equal to the row dimension of $B$ (the "lag" matrix).
For instance, if 


$$\underset{(1 \times 2)}{A} = \begin{bmatrix} a_{11} & a_{12} \end{bmatrix} \qquad \underset{(2 \times 3)}{B} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} \qquad (4.5)$$

the product $AB$ then is defined, since $A$ has two columns and $B$ has two rows, precisely the
same number.† This can be checked at a glance by comparing the second number in the
dimension indicator for $A$, which is $(1 \times 2)$, with the first number in the dimension
indicator for $B$, $(2 \times 3)$. On the other hand, the reverse product $BA$ is not defined in this case,
because $B$ (now the lead matrix) has three columns while $A$ (the lag matrix) has only one
row; hence the conformability condition is violated.

In general, if $A$ is of dimension $m \times n$ and $B$ is of dimension $p \times q$, the matrix product
$AB$ will be defined if and only if $n = p$. If defined, moreover, the product matrix $AB$ will
have the dimension $m \times q$, the same number of rows as the lead matrix $A$ and the same
number of columns as the lag matrix $B$. For the matrices given in (4.5), $AB$ will be $1 \times 3$.

It remains to define the exact procedure of multiplication. For this purpose, let us take
the matrices $A$ and $B$ in (4.5) for illustration. Since the product $AB$ is defined and is
expected to be of dimension $1 \times 3$, we may write in general (using the symbol $C$ rather than
$c'$ for the row vector) that

$$AB = C = \begin{bmatrix} c_{11} & c_{12} & c_{13} \end{bmatrix}$$

Each element in the product matrix $C$, denoted by $c_{ij}$, is defined as a sum of products, to be
computed from the elements in the $i$th row of the lead matrix $A$, and those in the $j$th column
of the lag matrix $B$. To find $c_{11}$, for instance, we should take the first row in $A$ (since $i = 1$)
and the first column in $B$ (since $j = 1$), as shown in the top panel of Fig. 4.1, and then
pair the elements together sequentially, multiply out each pair, and take the sum of the
resulting products, to get

$$c_{11} = a_{11}b_{11} + a_{12}b_{21} \qquad (4.6)$$

Similarly, for $c_{12}$, we take the first row in $A$ (since $i = 1$) and the second column in $B$ (since
$j = 2$), and calculate the indicated sum of products, in accordance with the lower panel of
Fig. 4.1, as follows:

$$c_{12} = a_{11}b_{12} + a_{12}b_{22} \qquad (4.6')$$

By the same token, we should also have

$$c_{13} = a_{11}b_{13} + a_{12}b_{23} \qquad (4.6'')$$


† The matrix $A$, being a row vector, would normally be denoted by $a'$. We use the symbol $A$ here to
stress the fact that the multiplication rule being explained applies to matrices in general, not only to
the product of one vector and one matrix.

[FIGURE 4.1: top panel, pairing of the first row of $A$ with the first column of $B$ (for $c_{11}$); lower panel, pairing of the first row of $A$ with the second column of $B$ (for $c_{12}$).]



It is the particular pairing requirement in this process which necessitates the matching of 
the column dimension of the lead matrix and the row dimension of the lag matrix before 
multiplication can be performed. 
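The conformability check and the row-by-column pairing can be sketched in Python as follows. This is an illustrative sketch rather than anything given in the text: the function names and the numerical entries of the $1 \times 2$ and $2 \times 3$ matrices are hypothetical, chosen only to match the dimensions in (4.5).

```python
def dimension(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(M), len(M[0])


def multiply(A, B):
    """Return the product AB, with A as the lead matrix and B as the lag matrix."""
    m, n = dimension(A)
    p, q = dimension(B)
    if n != p:
        raise ValueError(
            f"conformability condition violated: lead is {m}x{n}, lag is {p}x{q}"
        )
    # Element (i, j) pairs the ith row of A with the jth column of B.
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(q)]
        for i in range(m)
    ]


A = [[1, 2]]                  # 1x2, hypothetical entries
B = [[3, 4, 5], [6, 7, 8]]    # 2x3, hypothetical entries

C = multiply(A, B)            # defined: the product is 1x3
print(dimension(C), C)

try:
    multiply(B, A)            # B has three columns, A has only one row
except ValueError as error:
    print(error)
```

The error raised for `multiply(B, A)` corresponds to the statement that the reverse product is not defined for these dimensions.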

The multiplication procedure illustrated in Fig. 4.1 can also be described by using the
concept of the inner product of two vectors. Given two vectors $u$ and $v$ with $n$ elements
each, say, $(u_1, u_2, \ldots, u_n)$ and $(v_1, v_2, \ldots, v_n)$, arranged either as two rows or as two
columns or as one row and one column, their inner product, written as $u \cdot v$ (with a dot in
the middle), is defined as

$$u \cdot v = u_1v_1 + u_2v_2 + \cdots + u_nv_n$$

This is a sum of products of corresponding elements, and hence the inner product of two
vectors is a scalar.

Example 7

If, after a shopping trip, we arrange the quantities purchased of $n$ goods as a row vector
$Q' = [Q_1 \ \ Q_2 \ \ \cdots \ \ Q_n]$, and list the prices of those goods in a price vector
$P' = [P_1 \ \ P_2 \ \ \cdots \ \ P_n]$, then the inner product of these two vectors is

$$Q' \cdot P' = Q_1P_1 + Q_2P_2 + \cdots + Q_nP_n = \text{total purchase cost}$$
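As a small illustration of the inner product, the following Python sketch computes a total purchase cost. The quantities and prices are hypothetical numbers chosen here, and the function name `inner_product` is likewise an assumption of this sketch.

```python
def inner_product(u, v):
    """Sum of products of corresponding elements of two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("the two vectors must have the same number of elements")
    return sum(a * b for a, b in zip(u, v))


quantities = [2, 1, 3]    # hypothetical Q1, Q2, Q3
prices = [4, 10, 5]       # hypothetical P1, P2, P3

total_cost = inner_product(quantities, prices)   # 2*4 + 1*10 + 3*5
print(total_cost)
```

Because the inner product only pairs corresponding elements, it does not matter in this sketch whether the two lists are thought of as rows or as columns.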

Using this concept, we can describe the element $c_{ij}$ in the product matrix $C = AB$
simply as the inner product of the $i$th row of the lead matrix $A$ and the $j$th column of the lag
matrix $B$. By examining Fig. 4.1, we can easily verify the validity of this description.

The rule of multiplication just outlined applies with equal validity when the dimensions
of $A$ and $B$ are other than those illustrated in Fig. 4.1; the only prerequisite is that the
conformability condition be met.

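Example 8

Given

$$\underset{(3 \times 2)}{A} = \begin{bmatrix} 1 & 3 \\ 2 & 8 \\ 4 & 0 \end{bmatrix} \qquad \text{and} \qquad \underset{(2 \times 1)}{B} = \begin{bmatrix} 5 \\ 9 \end{bmatrix}$$

find $AB$. The product $AB$ is indeed defined because $A$ has two columns and $B$ has two rows.
Their product matrix should be $3 \times 1$, a column vector:

$$AB = \begin{bmatrix} 1(5) + 3(9) \\ 2(5) + 8(9) \\ 4(5) + 0(9) \end{bmatrix} = \begin{bmatrix} 32 \\ 82 \\ 20 \end{bmatrix}$$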


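Example 9

Given

$$\underset{(3 \times 3)}{A} = \begin{bmatrix} 3 & -1 & 2 \\ 1 & 0 & 3 \\ 4 & 0 & 2 \end{bmatrix} \qquad \text{and} \qquad \underset{(3 \times 3)}{B} = \begin{bmatrix} 0 & -\frac{1}{5} & \frac{3}{10} \\ -1 & \frac{1}{5} & \frac{7}{10} \\ 0 & \frac{2}{5} & -\frac{1}{10} \end{bmatrix}$$

find $AB$. The same rule of multiplication now yields a very special product matrix:

$$AB = \begin{bmatrix} 0 + 1 + 0 & -\frac{3}{5} - \frac{1}{5} + \frac{4}{5} & \frac{9}{10} - \frac{7}{10} - \frac{2}{10} \\ 0 + 0 + 0 & -\frac{1}{5} + 0 + \frac{6}{5} & \frac{3}{10} + 0 - \frac{3}{10} \\ 0 + 0 + 0 & -\frac{4}{5} + 0 + \frac{4}{5} & \frac{12}{10} + 0 - \frac{2}{10} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$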


This last matrix, a square matrix with 1s in its principal diagonal (the diagonal running from
northwest to southeast) and 0s everywhere else, exemplifies the important type of matrix
known as the identity matrix. This will be further discussed in Section 4.5.
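The arithmetic of Example 9 can be checked exactly with Python's standard `fractions` module. The sketch below is an illustration only; it assumes the entries of $A$ and $B$ are as read here from Example 9 (the fractional entries of $B$ in particular), and the helper name `multiply` is hypothetical.

```python
from fractions import Fraction as F


def multiply(A, B):
    """Product AB by the row-by-column rule; assumes the matrices are conformable."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


# Entries as read from Example 9 (an assumption of this sketch).
A = [[3, -1, 2],
     [1, 0, 3],
     [4, 0, 2]]
B = [[F(0), F(-1, 5), F(3, 10)],
     [F(-1), F(1, 5), F(7, 10)],
     [F(0), F(2, 5), F(-1, 10)]]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
AB = multiply(A, B)
print(AB == identity)
```

Exact fractions are used instead of floating-point numbers so that the comparison with the identity matrix is not disturbed by rounding.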


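Example 10

Let us now take the matrix $A$ and the vector $x$ as defined in (4.4) and find $Ax$. The product
matrix is a $3 \times 1$ column vector:

$$Ax = \underset{(3 \times 3)}{\begin{bmatrix} 6 & 3 & 1 \\ 1 & 4 & -2 \\ 4 & -1 & 5 \end{bmatrix}} \underset{(3 \times 1)}{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}} = \underset{(3 \times 1)}{\begin{bmatrix} 6x_1 + 3x_2 + x_3 \\ x_1 + 4x_2 - 2x_3 \\ 4x_1 - x_2 + 5x_3 \end{bmatrix}}$$

Note: The product on the right is a column vector, its corpulent appearance notwithstanding!
When we write $Ax = d$, therefore, we have

$$\begin{bmatrix} 6x_1 + 3x_2 + x_3 \\ x_1 + 4x_2 - 2x_3 \\ 4x_1 - x_2 + 5x_3 \end{bmatrix} = \begin{bmatrix} 22 \\ 12 \\ 10 \end{bmatrix}$$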


which, according to the definition of matrix equality, is equivalent to the statement of the 
entire equation system in (4.3). 

Note that, to use the matrix notation $Ax = d$, it is necessary, because of the conformability
condition, to arrange the variables $x_j$ into a column vector, even though these variables
are listed in a horizontal order in the original equation system.


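Example 11

The simple national-income model in two endogenous variables $Y$ and $C$,

$$Y = C + I_0 + G_0$$
$$C = a + bY$$

can be rearranged into the standard format of (4.1) as follows:

$$Y - C = I_0 + G_0$$
$$-bY + C = a$$

Hence the coefficient matrix $A$, the vector of variables $x$, and the vector of constants $d$ are

$$\underset{(2 \times 2)}{A} = \begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix} \qquad \underset{(2 \times 1)}{x} = \begin{bmatrix} Y \\ C \end{bmatrix} \qquad \underset{(2 \times 1)}{d} = \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix}$$

Let us verify that this given system can be expressed by the equation $Ax = d$.
By the rule of matrix multiplication, we have

$$Ax = \begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix} \begin{bmatrix} Y \\ C \end{bmatrix} = \begin{bmatrix} 1(Y) + (-1)(C) \\ -b(Y) + 1(C) \end{bmatrix} = \begin{bmatrix} Y - C \\ -bY + C \end{bmatrix}$$

Thus the matrix equation $Ax = d$ would give us

$$\begin{bmatrix} Y - C \\ -bY + C \end{bmatrix} = \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix}$$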

Since matrix equality means the equality between corresponding elements, it is clear that 
the equation Ax = d does precisely represent the original equation system, as expressed in 
the (4.1) format. 
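A numerical check may make the equivalence concrete. The Python sketch below is an illustration under stated assumptions: the parameter values for $I_0$, $G_0$, $a$, and $b$ are hypothetical, and the equilibrium values of $Y$ and $C$ are obtained by substituting the consumption function into the income identity, a step not carried out in this section.

```python
# Hypothetical parameter values (assumptions of this sketch).
I0, G0, a, b = 30.0, 20.0, 10.0, 0.75

A = [[1.0, -1.0],
     [-b, 1.0]]            # coefficient matrix
d = [I0 + G0, a]           # vector of constants

# Substituting C = a + bY into Y = C + I0 + G0 gives Y = (a + I0 + G0) / (1 - b).
Y = (a + I0 + G0) / (1 - b)
C = a + b * Y
x = [Y, C]                 # vector of variables, arranged as a column

# Each element of Ax pairs one row of A with the column x.
Ax = [sum(A[i][k] * x[k] for k in range(2)) for i in range(2)]
print(x)
print(Ax, d)
```

With these values the two elements of `Ax` reproduce the two elements of `d`, which is what the matrix equation asserts element by element.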


The Question of Division 

While matrices, like numbers, can undergo the operations of addition, subtraction, and
multiplication, subject to the conformability conditions, it is not possible to divide one
matrix by another. That is, we cannot write $A/B$.

For two numbers $a$ and $b$, the quotient $a/b$ (with $b \neq 0$) can be written alternatively as
$ab^{-1}$ or $b^{-1}a$, where $b^{-1}$ represents the inverse or reciprocal of $b$. Since $ab^{-1} = b^{-1}a$, the
quotient expression $a/b$ can be used to represent both $ab^{-1}$ and $b^{-1}a$. The case of matrices
is different. Applying the concept of inverses to matrices, we may in certain cases (discussed
in Sec. 4.6) define a matrix $B^{-1}$ that is the inverse of matrix $B$. But from the discussion
of the conformability condition it follows that, if $AB^{-1}$ is defined, there can be no
assurance that $B^{-1}A$ is also defined. Even if $AB^{-1}$ and $B^{-1}A$ are indeed both defined, they
still may not represent the same product. Hence the expression $A/B$ cannot be used without
ambiguity, and it must be avoided. Instead, you must specify whether you are referring
to $AB^{-1}$ or $B^{-1}A$, provided that the inverse $B^{-1}$ does exist and that the matrix product in
question is defined. Inverse matrices will be further discussed in Sec. 4.6.
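The ambiguity can be seen in a small numerical case. In the Python sketch below, the matrices $A$ and $B$ and the candidate inverse `B_inv` are hypothetical choices made for illustration; the inverse is not derived here but is merely checked by multiplying it with $B$, since the way inverses are found is left to Sec. 4.6.

```python
def multiply(A, B):
    """Product AB by the row-by-column rule; assumes the matrices are conformable."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


A = [[1, 2], [3, 4]]        # hypothetical matrix
B = [[1, 1], [0, 1]]        # hypothetical matrix
B_inv = [[1, -1], [0, 1]]   # candidate inverse of B, verified on the next line

print(multiply(B, B_inv))   # the 2x2 identity matrix confirms the candidate

post = multiply(A, B_inv)   # A times the inverse of B
pre = multiply(B_inv, A)    # the inverse of B times A
print(post)
print(pre)
print(post == pre)
```

Both products are defined here, yet they differ, so a single symbol such as $A/B$ could not stand for both.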

The $\Sigma$ Notation

The use of subscripted symbols not only helps in designating the locations of parameters
and variables but also lends itself to a flexible shorthand for denoting sums of terms, such
as those which arose during the process of matrix multiplication.

The summation shorthand makes use of the Greek letter $\Sigma$ (sigma, for "sum"). To
express the sum of $x_1$, $x_2$, and $x_3$, for instance, we may write

$$x_1 + x_2 + x_3 = \sum_{j=1}^{3} x_j$$

which is read as "the sum of $x_j$ as $j$ ranges from 1 to 3." The symbol $j$, called the summation
index, takes only integer values. The expression $x_j$ represents the summand (that which
is to be summed), and it is in effect a function of $j$. Aside from the letter $j$, summation
indices are also commonly denoted by $i$ or $k$, such as

$$\sum_{i=3}^{7} x_i = x_3 + x_4 + x_5 + x_6 + x_7$$

$$\sum_{k=0}^{n} x_k = x_0 + x_1 + \cdots + x_n$$

The application of $\Sigma$ notation can be readily extended to cases in which the $x$ term is
prefixed with a coefficient or in which each term in the sum is raised to some integer power.
For instance, we may write:

$$\sum_{j=1}^{3} ax_j = ax_1 + ax_2 + ax_3 = a(x_1 + x_2 + x_3) = a\sum_{j=1}^{3} x_j$$

$$\sum_{j=1}^{3} a_jx_j = a_1x_1 + a_2x_2 + a_3x_3$$

$$\sum_{i=0}^{n} a_ix^i = a_0x^0 + a_1x^1 + a_2x^2 + \cdots + a_nx^n = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$$

The last example, in particular, shows that the expression $\sum_{i=0}^{n} a_ix^i$ can in fact be used as a
shorthand form of the general polynomial function of (2.4).

It may be mentioned in passing that, whenever the context of the discussion leaves no
ambiguity as to the range of summation, the symbol $\Sigma$ can be used alone, without an index
attached (such as $\sum x_i$), or with only the index letter underneath (such as $\sum_i x_i$).
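Readers who program may find it helpful to see that the summation shorthand corresponds directly to a loop over the index. The Python sketch below is an illustration with hypothetical numbers; the dictionary `x` simply stores assumed values of $x_1$, $x_2$, and $x_3$ under their subscripts.

```python
x = {1: 2, 2: 5, 3: 7}   # hypothetical values of x_1, x_2, x_3
a = 3                    # hypothetical constant coefficient

# Sum of x_j as j ranges from 1 to 3 (range(1, 4) covers 1, 2, 3).
plain_sum = sum(x[j] for j in range(1, 4))

# A constant coefficient can be factored out of the sum.
scaled_sum = sum(a * x[j] for j in range(1, 4))
print(plain_sum, scaled_sum, scaled_sum == a * plain_sum)


def polynomial(coefficients, x_value):
    """Sum of a_i * x**i for i = 0, ..., n, with coefficients = [a_0, ..., a_n]."""
    return sum(a_i * x_value ** i for i, a_i in enumerate(coefficients))


# Hypothetical polynomial 1 + 0x + 2x^2 evaluated at x = 3.
value = polynomial([1, 0, 2], 3)
print(value)
```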

Let us apply the $\Sigma$ shorthand to matrix multiplication. In (4.6), (4.6'), and (4.6''), each
element of the product matrix $C = AB$ is defined as a sum of terms, which may now be
rewritten as follows:

$$c_{11} = a_{11}b_{11} + a_{12}b_{21} = \sum_{k=1}^{2} a_{1k}b_{k1}$$

$$c_{12} = a_{11}b_{12} + a_{12}b_{22} = \sum_{k=1}^{2} a_{1k}b_{k2}$$

$$c_{13} = a_{11}b_{13} + a_{12}b_{23} = \sum_{k=1}^{2} a_{1k}b_{k3}$$

In each case, the first subscript of $c_{1j}$ is reflected in the first subscript of $a_{1k}$, and the second
subscript of $c_{1j}$ is reflected in the second subscript of $b_{kj}$ in the $\Sigma$ expression. The
index $k$, on the other hand, is a "dummy" subscript; it serves to indicate which particular
pair of elements is being multiplied, but it does not show up in the symbol $c_{1j}$.
Extending this to the multiplication of an $m \times n$ matrix $A = [a_{ik}]$ and an $n \times p$ matrix
$B = [b_{kj}]$, we may now write the elements of the $m \times p$ product matrix $AB = C = [c_{ij}]$ as

$$c_{11} = \sum_{k=1}^{n} a_{1k}b_{k1} \qquad c_{12} = \sum_{k=1}^{n} a_{1k}b_{k2} \qquad \cdots$$

or more generally,

$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} \qquad \begin{pmatrix} i = 1, 2, \ldots, m \\ j = 1, 2, \ldots, p \end{pmatrix}$$

This last equation represents yet another way of stating the rule of multiplication for the
matrices defined above.
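The general formula translates almost literally into three nested loops, one for each of the indices $i$, $j$, and $k$. The Python sketch below is illustrative only; the function name is hypothetical, Python counts indices from 0 rather than 1, and the test matrices are assumed to be those of Example 8.

```python
def multiply(A, B):
    """Build C = AB from c_ij = sum over k of a_ik * b_kj (indices start at 0 here)."""
    m, n = len(A), len(A[0])
    rows_of_B, p = len(B), len(B[0])
    if n != rows_of_B:
        raise ValueError("conformability condition violated")
    C = [[0] * p for _ in range(m)]
    for i in range(m):            # row of the lead matrix
        for j in range(p):        # column of the lag matrix
            for k in range(n):    # dummy index: it does not appear in c_ij
                C[i][j] += A[i][k] * B[k][j]
    return C


# Matrices as read from Example 8: a 3x2 lead matrix and a 2x1 lag matrix.
A = [[1, 3], [2, 8], [4, 0]]
B = [[5], [9]]
C = multiply(A, B)
print(C)
```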


EXERCISE 4.2 


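1. Given $A = \begin{bmatrix} 7 & -1 \\ 6 & 9 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 4 \\ 3 & -2 \end{bmatrix}$, and $C =$ [entries lost in the source], find:

(a) $A + B$ (b) $C - A$ (c) $3A$ (d) $4B + 2C$

2. Given $A = \begin{bmatrix} 2 & 8 \\ 3 & 0 \\ 5 & 1 \end{bmatrix}$, $B =$ [entries lost in the source], and $C =$ [entries lost in the source]:

(a) Is $AB$ defined? Calculate $AB$. Can you calculate $BA$? Why?

(b) Is $BC$ defined? Calculate $BC$. Is $CB$ defined? If so, calculate $CB$. Is it true that $BC = CB$?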

3. On the basis of the matrices given in Example 9, is the product BA defined? If so, 
calculate the product. In this case do we have AB = BA? 

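4. Find the product matrices in the following (in each case, append beneath every matrix
a dimension indicator):

(a) and (b): [matrices not recoverable from the source]

(c) $\begin{bmatrix} 3 & 5 & 0 \\ 4 & 2 & -7 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$

(d) $\begin{bmatrix} a & b & c \end{bmatrix} \begin{bmatrix} 7 & 0 \\ 0 & 2 \\ 1 & 4 \end{bmatrix}$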


5. In Example 7, if we arrange the quantities and prices as column vectors instead of row
vectors, is $Q \cdot P$ defined? Can we express the total purchase cost as $Q \cdot P$? As $Q'P$? As
$Q' \cdot P'$?

6. Expand the following summation expressions: 

(ci)X> 

i -2 '=1 

(e)'£(x + i ) 2 

J-S 1=0 

w £ 
7. Rewrite the following in $\Sigma$ notation:

(a) $x_1(x_1 - 1) + 2x_2(x_2 - 1) + 3x_3(x_3 - 1)$

(b) $a_2(x_3 + 2) + a_3(x_4 + 3) + a_4(x_5 + 4)$

(c) $\dfrac{1}{x} + \dfrac{1}{x^2} + \cdots + \dfrac{1}{x^n}$ $(x \neq 0)$

(d) $1 + \dfrac{1}{x} + \dfrac{1}{x^2} + \cdots + \dfrac{1}{x^n}$ $(x \neq 0)$

8. Show that the following are true:

(a) $\left( \sum_{i=0}^{n} x_i \right) + x_{n+1} = \sum_{i=0}^{n+1} x_i$

(b) $\sum_{j=1}^{n} ab_jy_j = a \sum_{j=1}^{n} b_jy_j$

(c) $\sum_{j=1}^{n} (x_j + y_j) = \sum_{j=1}^{n} x_j + \sum_{j=1}^{n} y_j$


4.3 Notes on Vector Operations

In Secs. 4.1 and 4.2, vectors are considered as a special type of matrix. As such, they qualify
for the application of all the algebraic operations discussed. Owing to their dimensional
peculiarities, however, some additional comments on vector operations are useful.

Multiplication of Vectors

An $m \times 1$ column vector $u$ and a $1 \times n$ row vector $v'$ yield a product matrix $uv'$ of
dimension $m \times n$.


Example 1

Given $u = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$ and $v' = \begin{bmatrix} 1 & 4 & 5 \end{bmatrix}$, we can get

$$uv' = \begin{bmatrix} 3(1) & 3(4) & 3(5) \\ 2(1) & 2(4) & 2(5) \end{bmatrix} = \begin{bmatrix} 3 & 12 & 15 \\ 2 & 8 & 10 \end{bmatrix}$$

Since each row in $u$ consists of one element only, as does each column in $v'$, each element
of $uv'$ turns out to be a single product instead of a sum of products. The product $uv'$ is a
$2 \times 3$ matrix, even though what we started out with are a pair of vectors.


On the other hand, given a $1 \times n$ row vector $u'$ and an $n \times 1$ column vector $v$, the product
$u'v$ will be of dimension $1 \times 1$.

Example 2

Given $u' = \begin{bmatrix} 3 & 4 \end{bmatrix}$ and $v = \begin{bmatrix} 9 \\ 7 \end{bmatrix}$, we have

$$u'v = [3(9) + 4(7)] = [55]$$


As written, $u'v$ is a matrix, despite the fact that only a single element is present. However,
$1 \times 1$ matrices behave exactly like scalars with respect to addition and multiplication:
$[4] + [8] = [12]$, just as $4 + 8 = 12$; and $[3][7] = [21]$, just as $3(7) = 21$. Moreover, $1 \times 1$
matrices possess no major properties that scalars do not have. In fact, there is a one-to-one
correspondence between the set of all scalars and the set of all $1 \times 1$ matrices whose elements
are scalars. For this reason, we may redefine $u'v$ to be the scalar corresponding to the
$1 \times 1$ product matrix. For Example 2, we can accordingly write $u'v = 55$. Such a product is
called a scalar product.† Remember, however, that while a $1 \times 1$ matrix can be treated as a
scalar, a scalar cannot be replaced by a $1 \times 1$ matrix at will if further calculation is to be
carried out, because complications regarding conformability conditions may arise.

Example 3

Given a row vector $u' = \begin{bmatrix} 3 & 6 & 9 \end{bmatrix}$, find $u'u$. Since $u$ is merely the column vector with the
elements of $u'$ arranged vertically, we have

$$u'u = \begin{bmatrix} 3 & 6 & 9 \end{bmatrix} \begin{bmatrix} 3 \\ 6 \\ 9 \end{bmatrix} = (3)^2 + (6)^2 + (9)^2$$

where we have omitted the brackets from the $1 \times 1$ product matrix on the right. Note that
the product $u'u$ gives the sum of squares of the elements of $u$.

In general, if $u' = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}$, then $u'u$ will be the sum of squares (a scalar) of the
elements $u_j$:

$$u'u = u_1^2 + u_2^2 + \cdots + u_n^2 = \sum_{j=1}^{n} u_j^2$$

Had we calculated the inner product $u \cdot u$ (or $u' \cdot u'$), we would have, of course, obtained
exactly the same result.


To conclude, it is important to distinguish between the meanings of $uv'$ (a matrix larger
than $1 \times 1$) and $u'v$ (a $1 \times 1$ matrix, or a scalar). Observe, in particular, that a scalar
product must have a row vector as the lead matrix and a column vector as the lag matrix;
otherwise the product cannot be $1 \times 1$.
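The contrast between the two kinds of vector product can be sketched in Python. This is an illustration, not part of the text: the function names `column_times_row` and `scalar_product` are hypothetical, vectors are stored as plain lists, and the numbers are assumed to be those of Examples 1 to 3 of this section.

```python
def column_times_row(u, v):
    """Product of a column vector u (m x 1) and a row vector v' (1 x n): an m x n matrix."""
    return [[a * b for b in v] for a in u]


def scalar_product(u, v):
    """Product of a row vector u' (1 x n) and a column vector v (n x 1), as a scalar."""
    if len(u) != len(v):
        raise ValueError("the two vectors must have the same number of elements")
    return sum(a * b for a, b in zip(u, v))


matrix = column_times_row([3, 2], [1, 4, 5])      # Example 1: a 2x3 matrix
scalar = scalar_product([3, 4], [9, 7])           # Example 2: a single number
sum_of_squares = scalar_product([3, 6, 9], [3, 6, 9])   # Example 3: 9 + 36 + 81

print(matrix)
print(scalar)
print(sum_of_squares)
```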


Geometric Interpretation of Vector Operations 

It was mentioned earlier that a column or row vector with $n$ elements (referred to hereafter
as an $n$-vector) can be viewed as an $n$-tuple, and hence as a point in an $n$-dimensional space
(referred to hereafter as an $n$-space). Let us elaborate on this idea. In Fig. 4.2a, a point (3, 2)
is plotted in a 2-space and is labeled $u$. This is the geometric counterpart of the vector
$u = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$ or the vector $u' = \begin{bmatrix} 3 & 2 \end{bmatrix}$, both of which indicate in this context one and the
same ordered pair. If an arrow (a directed-line segment) is drawn from the point of origin
(0, 0) to the point $u$, it will specify the unique straight route by which to reach the destination
point $u$ from the point of origin. Since a unique arrow exists for each point, we can
regard the vector $u$ as graphically represented either by the point (3, 2), or by the corresponding
arrow. Such an arrow, which emanates from the origin (0, 0) like the hand of a
clock, with a definite length and a definite direction, is called a radius vector.


† The concept of scalar product is thus akin to the concept of inner product of two vectors with the
same number of elements in each, which also yields a scalar. Recall, however, that the inner product is
exempted from the conformability condition for multiplication, so that we may write it as $u \cdot v$. In the
case of scalar product (denoted without a dot between the two vector symbols), on the other hand,
we can express it only as a row vector multiplied by a column vector, with the row vector in the lead.

[FIGURE 4.2: panels (a) to (d), vectors plotted as arrows in the 2-space.]






Following this new interpretation of a vector, it becomes possible to give geometric
meanings to (a) the scalar multiplication of a vector, (b) the addition and subtraction of vectors,
and more generally, (c) the so-called linear combination of vectors.

First, if we plot the vector $\begin{bmatrix} 6 \\ 4 \end{bmatrix} = 2u$ in Fig. 4.2a, the resulting arrow will overlap the
old one but will be twice as long. In fact, the multiplication of vector $u$ by any scalar $k$ will
produce an overlapping arrow, but the arrowhead will be relocated, unless $k = 1$. If the
scalar multiplier is $k > 1$, the arrow will be extended out (scaled up); if $0 < k < 1$, the
arrow will be shortened (scaled down); if $k = 0$, the arrow will shrink into the point of
origin, which represents a null vector, $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$. A negative scalar multiplier will even reverse
the direction of the arrow. If the vector $u$ is multiplied by $-1$, for instance, we get
$-u = \begin{bmatrix} -3 \\ -2 \end{bmatrix}$, and this plots in Fig. 4.2b as an arrow of the same length as $u$ but diametrically
opposite in direction.

Next, consider the addition of two vectors, $v = \begin{bmatrix} 1 \\ 4 \end{bmatrix}$ and $u = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$. The sum
$v + u = \begin{bmatrix} 4 \\ 6 \end{bmatrix}$ can be directly plotted as the broken arrow in Fig. 4.2c. If we construct a parallelogram
with the two vectors $u$ and $v$ (solid arrows) as two of its sides, however, the diagonal of the
parallelogram will turn out exactly to be the arrow representing the vector sum $v + u$. In
general, a vector sum can be obtained geometrically from a parallelogram. Moreover, this
method can also give us the vector difference $v - u$, since the latter is equivalent to the sum
of $v$ and $(-1)u$. In Fig. 4.2d, we first reproduce the vector $v$ and the negative vector $-u$
from diagrams c and b, respectively, and then construct a parallelogram. The resulting
diagonal represents the vector difference $v - u$.

It takes only a simple extension of these results to interpret geometrically a linear
combination (i.e., a linear sum or difference) of vectors. Consider the simple case of

$$3v + 2u = 3\begin{bmatrix} 1 \\ 4 \end{bmatrix} + 2\begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 9 \\ 16 \end{bmatrix}$$

The scalar multiplication aspect of this operation involves the relocation of the respective
arrowheads of the two vectors $v$ and $u$, and the addition aspect calls for the construction of
a parallelogram. Beyond these two basic graphical operations, there is nothing new in a linear
combination of vectors. This is true even if there are more terms in the linear combination,
as in

$$\sum_{i=1}^{n} k_iv_i = k_1v_1 + k_2v_2 + \cdots + k_nv_n$$

where $k_i$ are a set of scalars but the subscripted symbols $v_i$ now denote a set of vectors. To
form this sum, the first two terms may be added first, and then the resulting sum is added to
the third, and so forth, till all terms are included.
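The term-by-term procedure for forming a linear combination can be sketched in Python. The sketch is illustrative; the function name is hypothetical, and the vectors are assumed to be $v = (1, 4)$ and $u = (3, 2)$ as plotted in Fig. 4.2.

```python
def linear_combination(scalars, vectors):
    """Return k_1*v_1 + k_2*v_2 + ... + k_n*v_n, adding one term at a time."""
    result = [0] * len(vectors[0])
    for k, vector in zip(scalars, vectors):
        # Scale the vector by k, then add it to the running sum.
        result = [r + k * element for r, element in zip(result, vector)]
    return result


v = [1, 4]
u = [3, 2]

print(linear_combination([1, 1], [v, u]))    # the sum v + u
print(linear_combination([1, -1], [v, u]))   # the difference v - u, i.e. v + (-1)u
print(linear_combination([3, 2], [v, u]))    # the combination 3v + 2u
```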

Linear Dependence 

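A set of vectors $v_1, \ldots, v_n$ is said to be linearly dependent if (and only if) any one of them
can be expressed as a linear combination of the remaining vectors; otherwise they are
linearly independent.

Example 4

The three vectors $v_1 = \begin{bmatrix} 2 \\ 7 \end{bmatrix}$, $v_2 = \begin{bmatrix} 1 \\ 8 \end{bmatrix}$, and $v_3 = \begin{bmatrix} 4 \\ 5 \end{bmatrix}$ are linearly dependent because $v_3$
is a linear combination of $v_1$ and $v_2$:

$$3v_1 - 2v_2 = \begin{bmatrix} 6 \\ 21 \end{bmatrix} - \begin{bmatrix} 2 \\ 16 \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \end{bmatrix} = v_3$$

Note that this last equation is alternatively expressible as

$$3v_1 - 2v_2 - v_3 = 0$$

where $0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ represents a null vector (also called the zero vector).

Example 5

The two row vectors $v_1' = \begin{bmatrix} 5 & 12 \end{bmatrix}$ and $v_2' = \begin{bmatrix} 10 & 24 \end{bmatrix}$ are linearly dependent because

$$2v_1' = 2\begin{bmatrix} 5 & 12 \end{bmatrix} = \begin{bmatrix} 10 & 24 \end{bmatrix} = v_2'$$

The fact that one vector is a multiple of another vector illustrates the simplest case of linear
combination. Note again that this last equation may be written equivalently as

$$2v_1' - v_2' = 0'$$

where $0'$ represents the null row vector $\begin{bmatrix} 0 & 0 \end{bmatrix}$.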
With the introduction of null vectors, linear dependence may be redefined as follows. A
set of $m$-vectors $v_1, \ldots, v_n$ is linearly dependent if and only if there exists a set of scalars
$k_1, \ldots, k_n$ (not all zero) such that

$$\sum_{i=1}^{n} k_iv_i = \underset{(m \times 1)}{0}$$

If this equation can be satisfied only when $k_i = 0$ for all $i$, on the other hand, these vectors
are linearly independent.
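The redefinition lends itself to a direct check: given a proposed set of scalars, form the linear combination and see whether it is the null vector. The Python sketch below is illustrative. The function names are hypothetical, the vectors are assumed to be those of Examples 4 and 5, and the test in `dependent_pair` is a shortcut for pairs of 2-vectors that is not taken from the text.

```python
def combine(scalars, vectors):
    """Return the linear combination k_1*v_1 + ... + k_n*v_n as a list."""
    size = len(vectors[0])
    return [
        sum(k * vector[position] for k, vector in zip(scalars, vectors))
        for position in range(size)
    ]


def dependent_pair(u, v):
    """Two 2-vectors lie on one straight line through the origin
    exactly when u1*v2 - u2*v1 equals zero (a shortcut assumed here)."""
    return u[0] * v[1] - u[1] * v[0] == 0


# Example 4: scalars 3, -2, -1 (not all zero) give the null vector.
print(combine([3, -2, -1], [[2, 7], [1, 8], [4, 5]]))

# Example 5: scalars 2, -1 give the null row vector.
print(combine([2, -1], [[5, 12], [10, 24]]))

print(dependent_pair([3, 2], [6, 4]))   # u and 2u
print(dependent_pair([3, 2], [1, 4]))   # u and v of Fig. 4.2c
```

Note that the sketch only verifies a given set of scalars; it does not search for one.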

The concept of linear dependence admits of an easy geometric interpretation also. Two
vectors $u$ and $2u$ (one being a multiple of the other) are obviously dependent. Geometrically,
in Fig. 4.2a, their arrows lie on a single straight line. The same is true of the two
dependent vectors $u$ and $-u$ in Fig. 4.2b. In contrast, the two vectors $u$ and $v$ of Fig. 4.2c
are linearly independent, because it is impossible to express one as a multiple of the other.
Geometrically, their arrows do not lie on a single straight line.

When more than two vectors in the 2-space are considered, there emerges this significant
conclusion: once we have found two linearly independent vectors in the 2-space (say, $u$ and $v$),
all the other vectors in that space will be expressible as a linear combination of these ($u$ and $v$).
In Fig. 4.2c and d, it has already been illustrated how the two simple linear combinations $v + u$
and $v - u$ can be found. Furthermore, by extending, shortening, and reversing the given vectors
$u$ and $v$ and then combining these into various parallelograms, we can generate an infinite
number of new vectors, which will exhaust the set of all 2-vectors. Because of this, any set of
three or more 2-vectors (three or more vectors in a 2-space) must be linearly dependent. Two
of them can be independent, but then the third must be a linear combination of the first two.


Vector Space 

The totality of the 2-vectors generated by the various linear combinations of two independent
vectors $u$ and $v$ constitutes the two-dimensional vector space. Since we are dealing
only with vectors with real-valued elements, this vector space is none other than $R^2$, the
2-space we have been referring to all along. The 2-space cannot be generated by a single
2-vector, because linear combinations of the latter can only give rise to the set of vectors
lying on a single straight line. Nor does the generation of the 2-space require more than two
linearly independent 2-vectors; at any rate, it would be impossible to find more than two.

The two linearly independent vectors $u$ and $v$ are said to span the 2-space. They are also
said to constitute a basis for the 2-space. Note that we said a basis, not the basis, because
any pair of 2-vectors can serve in that capacity as long as they are linearly independent. In
particular, consider the two vectors $[1 \ \ 0]$ and $[0 \ \ 1]$, which are called unit vectors. The
first one plots as an arrow lying along the horizontal axis, and the second, an arrow lying
along the vertical axis. Because they are linearly independent, they can serve as a basis for
the 2-space, and we do in fact ordinarily think of the 2-space as spanned by its two axes,
which are nothing but the extended versions of the two unit vectors.

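By analogy, the three-dimensional vector space is the totality of 3-vectors, and it must
be spanned by exactly three linearly independent 3-vectors. As an illustration, consider the
set of three unit vectors

$$e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \qquad e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \qquad (4.7)$$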
[FIGURE 4.3: the unit vectors $e_1$, $e_2$, and $e_3$ on the three axes of the 3-space.]

where each $e_i$ is a vector with 1 as its $i$th element and with zeros elsewhere.† These three
vectors are obviously linearly independent; in fact, their arrows lie on the three axes of the
3-space in Fig. 4.3. Thus they span the 3-space, which implies that the entire 3-space ($R^3$,
in our framework) can be generated from these unit vectors. For example, the vector
$\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$ can be considered as the linear combination $e_1 + 2e_2 + 2e_3$. Geometrically, we can first
add the vectors $e_1$ and $2e_2$ in Fig. 4.3 by the parallelogram method, in order to get the vector
represented by the point (1, 2, 0) in the $x_1x_2$ plane, and then add the latter vector to
$2e_3$ (via the parallelogram constructed in the shaded vertical plane) to obtain the desired
final result, at the point (1, 2, 2).
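The role of the unit vectors as a basis can be sketched in Python: any vector is rebuilt by weighting each unit vector with the corresponding element. The function names below are hypothetical, and the sketch is an illustration rather than a procedure taken from the text.

```python
def unit_vector(i, n):
    """n-element vector with 1 as its ith element (counting from 1) and zeros elsewhere."""
    return [1 if position == i else 0 for position in range(1, n + 1)]


def from_unit_vectors(coefficients):
    """Form k_1*e_1 + k_2*e_2 + ... + k_n*e_n for the given coefficients."""
    n = len(coefficients)
    result = [0] * n
    for i, k in enumerate(coefficients, start=1):
        e_i = unit_vector(i, n)
        result = [r + k * element for r, element in zip(result, e_i)]
    return result


print(unit_vector(1, 3), unit_vector(2, 3), unit_vector(3, 3))

# The combination e1 + 2*e2 + 2*e3 reproduces the vector (1, 2, 2).
print(from_unit_vectors([1, 2, 2]))
```

The coefficients of the combination are simply the elements of the target vector, which is why the unit vectors make such a convenient basis.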

The further extension to $n$-space should be obvious. The $n$-space can be defined as the
totality of $n$-vectors. Though nongraphable, we can still think of the $n$-space as being
spanned by a total of $n$ ($n$-element) unit vectors that are all linearly independent. Each
$n$-vector, being an ordered $n$-tuple, represents a point in the $n$-space, or an arrow extending
from the point of origin (i.e., the $n$-element null vector) to the said point. And any given set
of $n$ linearly independent $n$-vectors is, in fact, capable of generating the entire $n$-space.
Since, in our discussion, each element of the $n$-vector is restricted to be a real number, this
$n$-space is in fact $R^n$.

The $n$-space we have referred to is sometimes more specifically called the Euclidean
$n$-space (named after Euclid). To explain this latter concept, we must first comment briefly
on the concept of distance between two vector points. For any pair of vector points $u$ and $v$
in a given space, the distance from $u$ to $v$ is some real-valued function

$$d = d(u, v)$$

with the following properties: (1) when $u$ and $v$ coincide, the distance is zero; (2) when the
two points are distinct, the distance from $u$ to $v$ and the distance from $v$ to $u$ are represented
by an identical positive real number; and (3) the distance between $u$ and $v$ is never longer
than the distance from $u$ to $w$ (a point distinct from $u$ and $v$) plus the distance from $w$ to $v$.
Expressed symbolically,

$$d(u, v) = 0 \qquad (\text{for } u = v)$$
$$d(u, v) = d(v, u) > 0 \qquad (\text{for } u \neq v)$$
$$d(u, v) \leq d(u, w) + d(w, v) \qquad (\text{for } w \neq u, v)$$

The last property is known as the triangular inequality, because the three points $u$, $v$, and
$w$ together will usually define a triangle.

† The symbol $e$ may be associated with the German word eins, for "one."

When a vector space has a distance function defined that fulfills the previous three properties,
it is called a metric space. However, note that the distance $d(u, v)$ has been discussed
only in general terms. Depending on the specific form assigned to the $d$ function, there may
result a variety of metric spaces. The so-called Euclidean space is one specific type of
metric space, with a distance function defined as follows. Let point $u$ be the $n$-tuple
$(a_1, a_2, \ldots, a_n)$ and point $v$ be the $n$-tuple $(b_1, b_2, \ldots, b_n)$; then the Euclidean distance
function is

$$d(u, v) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}$$

where the square root is taken to be positive. As can be easily verified, this specific distance
function satisfies all three properties previously enumerated. Applied to the two-dimensional
space in Fig. 4.2a, the distance between the two points (6, 4) and (3, 2) is
found to be

$$\sqrt{(6 - 3)^2 + (4 - 2)^2} = \sqrt{3^2 + 2^2} = \sqrt{13}$$

This result is seen to be consistent with Pythagoras's theorem, which states that the length
of the hypotenuse of a right-angled triangle is equal to the (positive) square root of the sum
of the squares of the lengths of the other two sides. For if we take (6, 4) and (3, 2) to be $u$
and $v$, and plot a new point $w$ at (6, 2), we shall indeed have a right-angled triangle with the
lengths of its horizontal and vertical sides equal to 3 and 2, respectively, and the length of
the hypotenuse (the distance between $u$ and $v$) equal to $\sqrt{3^2 + 2^2} = \sqrt{13}$.

The Euclidean distance function can also be expressed in terms of the square root of
a scalar product of two vectors. Since $u$ and $v$ denote the two $n$-tuples $(a_1, \ldots, a_n)$ and
$(b_1, \ldots, b_n)$, we can write a column vector $u - v$, with elements $a_1 - b_1, a_2 - b_2, \ldots,
a_n - b_n$. What goes under the square-root sign in the Euclidean distance function is, of
course, simply the sum of squares of these $n$ elements, which, in view of Example 3 of this
section, can be written as the scalar product $(u - v)'(u - v)$. Hence we have

$$d(u, v) = \sqrt{(u - v)'(u - v)}$$
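The Euclidean distance function and its three properties can be checked numerically. The Python sketch below is illustrative; the function name `distance` is hypothetical, and the points are assumed to be $u = (6, 4)$, $v = (3, 2)$, and $w = (6, 2)$ as in the discussion of Fig. 4.2a.

```python
import math


def distance(u, v):
    """Euclidean distance: square root of the scalar product (u - v)'(u - v)."""
    if len(u) != len(v):
        raise ValueError("both points must lie in the same n-space")
    differences = [a - b for a, b in zip(u, v)]
    return math.sqrt(sum(d * d for d in differences))


u, v, w = (6, 4), (3, 2), (6, 2)

print(distance(u, v))                                       # square root of 13
print(distance(u, u) == 0)                                  # property (1)
print(distance(u, v) == distance(v, u))                     # property (2)
print(distance(u, v) <= distance(u, w) + distance(w, v))    # property (3)
```

Checking the properties on one set of points is of course not a proof; it only illustrates what each property says.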


EXERCISE 4.3 

1. Given $u' = [5 \ \ 1 \ \ 3]$, $v' = [3 \ \ 1 \ \ {-1}]$, $w' = [7 \ \ 5 \ \ 8]$, and $x' = [x_1 \ \ x_2 \ \ x_3]$, write
out the column vectors $u$, $v$, $w$, and $x$, and find
(a) $uv'$ (c) $xx'$ (e) $u'v$ (g) $u'u$
(b) $uw'$ (d) $v'u$ (f) $w'x$ (h) $x'x$
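2. Given $w = \begin{bmatrix} 3 \\ 2 \\ 16 \end{bmatrix}$, $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$, and $z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$:

(a) Which of the following are defined: $w'x$, $x'y'$, $xy'$, $y'y$, $zz'$, $yw'$, $x \cdot y$?

(b) Find all the products that are defined.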

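3. Having sold $n$ items of merchandise at quantities $Q_1, \ldots, Q_n$ and prices $P_1, \ldots, P_n$,
how would you express the total revenue in (a) $\Sigma$ notation and (b) vector notation?

4. Given two nonzero vectors $w_1$ and $w_2$, the angle $\theta$ ($0^\circ \leq \theta \leq 180^\circ$) they form is related
to the scalar product $w_1'w_2$ ($= w_2'w_1$) as follows:

$$\theta \text{ is a(n) } \begin{Bmatrix} \text{acute} \\ \text{right} \\ \text{obtuse} \end{Bmatrix} \text{ angle if and only if } w_1'w_2 \begin{Bmatrix} > \\ = \\ < \end{Bmatrix} 0$$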

Verify this by computing the scalar product for each of the following pair of vectors (see 
Figs. 4.2 and 4.3): 


( 0 ) w^ = 



( b ) W] = 

'1 ' 

4_ 

, VV 2 = 

(C) W] = 

3’ 

r 

2 

; W 2 = 


-3 

-2 

-3' 

-2 


(d) W] = 


(e) w, = 


V 


O' 

0 

, w 2 = 

2 

_o. 


.0. 

"1' 


'1 ' 

2 

= 

2 

.2_ 


0 


5. Given $u = \begin{bmatrix} 5 \\ 1 \end{bmatrix}$ and $v$ = [entries missing in the source], find the following graphically:

(a) $2v$ (b) $u + v$ (c) $u - v$ (d) $v - u$ (e) $2u + 3v$ (f) $4u - 2v$

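6. Since the 3-space is spanned by the three unit vectors defined in (4.7), any other
3-vector should be expressible as a linear combination of $e_1$, $e_2$, and $e_3$. Show that the
following 3-vectors can be so expressed:

(a) $\begin{bmatrix} 4 \\ 7 \\ 0 \end{bmatrix}$ (b) $\begin{bmatrix} 25 \\ -2 \\ 1 \end{bmatrix}$ (c) $\begin{bmatrix} -1 \\ 6 \\ 9 \end{bmatrix}$ (d) $\begin{bmatrix} 2 \\ 0 \\ 8 \end{bmatrix}$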


7. In the three-dimensional Euclidean space, what is the distance between the following
points?

(a) (3, 2, 8) and (0, -1, 5) (b) (9, 0, 4) and (2, 0, -4)

8. The triangular inequality is written with the weak inequality sign $\le$, rather than the
strict inequality sign $<$. Under what circumstances would the "=" part of the inequality
apply?

9. Express the length of a radius vector $v$ in the Euclidean $n$-space (i.e., the distance from
the origin to point $v$) by using each of the following:

(a) scalars (b) a scalar product (c) an inner product

4.4 Commutative, Associative, and Distributive Laws


In ordinary scalar algebra, the additive and multiplicative operations obey the commutative,
associative, and distributive laws as follows:

Commutative law of addition: $a + b = b + a$
Commutative law of multiplication: $ab = ba$
Associative law of addition: $(a + b) + c = a + (b + c)$
Associative law of multiplication: $(ab)c = a(bc)$
Distributive law: $a(b + c) = ab + ac$

These have been referred to during the discussion of the similarly named laws applicable to
the union and intersection of sets. Most, but not all, of these laws also apply to matrix
operations, the significant exception being the commutative law of multiplication.


Matrix Addition 

Matrix addition is commutative as well as associative. This follows from the fact that matrix
addition calls only for the addition of the corresponding elements of two matrices, and
that the order in which each pair of corresponding elements is added is immaterial. In this
context, incidentally, the subtraction operation $A - B$ can simply be regarded as the addition
operation $A + (-B)$, and thus no separate discussion is necessary.
The commutative and associative laws can be stated as follows:

Commutative law $A + B = B + A$

Proof $A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = B + A$


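Example 1

Given $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 6 & 2 \\ 3 & 4 \end{bmatrix}$, we find that

$$A + B = B + A = \begin{bmatrix} 9 & 3 \\ 3 & 6 \end{bmatrix}$$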


Associative law $(A + B) + C = A + (B + C)$

Proof $(A + B) + C = [a_{ij} + b_{ij}] + [c_{ij}] = [a_{ij} + b_{ij} + c_{ij}]$

$= [a_{ij}] + [b_{ij} + c_{ij}] = A + (B + C)$


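Example 2

Given $v_1 = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$, $v_2 = \begin{bmatrix} 9 \\ 1 \end{bmatrix}$, and $v_3 = \begin{bmatrix} 2 \\ 5 \end{bmatrix}$, we find that

$$(v_1 + v_2) - v_3 = \begin{bmatrix} 12 \\ 5 \end{bmatrix} - \begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} 10 \\ 0 \end{bmatrix}$$

which is equal to

$$v_1 + (v_2 - v_3) = \begin{bmatrix} 3 \\ 4 \end{bmatrix} + \begin{bmatrix} 7 \\ -4 \end{bmatrix} = \begin{bmatrix} 10 \\ 0 \end{bmatrix}$$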
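
The two laws of matrix addition can also be confirmed element by element. The sketch below uses the matrices $A$ and $B$ of Example 1; the third matrix `C` and the helper name `add` are hypothetical, introduced only for illustration.

```python
def add(A, B):
    """Element-by-element sum of two matrices of equal dimension."""
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        raise ValueError("matrices must have the same dimension")
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

A = [[3, 1], [0, 2]]   # from Example 1
B = [[6, 2], [3, 4]]   # from Example 1
C = [[1, 0], [5, -2]]  # hypothetical third matrix

sum_ab = add(A, B)
sum_ba = add(B, A)            # commutative law: same as sum_ab
left = add(add(A, B), C)      # (A + B) + C
right = add(A, add(B, C))     # A + (B + C)
```

Both comparisons succeed because each element of the sum is an ordinary scalar sum, to which the scalar laws already apply.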


Applied to the linear combination of vectors $k_1v_1 + \cdots + k_nv_n$, the associative law permits
us to select any pair of terms for addition (or subtraction) first, instead of having to follow
the sequence in which the $n$ terms are listed.

Matrix Multiplication

Matrix multiplication is not commutative, that is,

$$AB \ne BA$$

As explained previously, even when $AB$ is defined, $BA$ may not be; but even if both products
are defined, the general rule is still $AB \ne BA$.

Example 3

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & -1 \\ 6 & 7 \end{bmatrix}$; then

$$AB = \begin{bmatrix} 1(0) + 2(6) & 1(-1) + 2(7) \\ 3(0) + 4(6) & 3(-1) + 4(7) \end{bmatrix} = \begin{bmatrix} 12 & 13 \\ 24 & 25 \end{bmatrix}$$

but

$$BA = \begin{bmatrix} 0(1) - 1(3) & 0(2) - 1(4) \\ 6(1) + 7(3) & 6(2) + 7(4) \end{bmatrix} = \begin{bmatrix} -3 & -4 \\ 27 & 40 \end{bmatrix}$$

Example 4

Let $u'$ be $1 \times 3$ (a row vector); then the corresponding column vector $u$ must be $3 \times 1$. The
product $u'u$ will be $1 \times 1$, but the product $uu'$ will be $3 \times 3$. Thus, obviously, $u'u \ne uu'$.

In view of the general rule $AB \ne BA$, the terms premultiply and postmultiply are often
used to specify the order of multiplication. In the product $AB$, the matrix $B$ is said to be
premultiplied by $A$, and $A$ to be postmultiplied by $B$.

There do exist interesting exceptions to the rule $AB \ne BA$, however. One such case is
when $A$ is a square matrix and $B$ is an identity matrix. Another is when $A$ is the inverse of
$B$, that is, when $A = B^{-1}$. Both of these will be taken up again later. It should also be
remarked here that the scalar multiplication of a matrix does obey the commutative law;
thus, if $k$ is a scalar, then

$$kA = Ak$$

Although it is not in general commutative, matrix multiplication is associative.

Associative law $(AB)C = A(BC) = ABC$

In forming the product $ABC$, the conformability condition must naturally be satisfied by
each adjacent pair of matrices. If $A$ is $m \times n$ and if $C$ is $p \times q$, then conformability requires
that $B$ be $n \times p$:

$$\underset{(m \times n)}{A} \; \underset{(n \times p)}{B} \; \underset{(p \times q)}{C}$$

Note the dual appearance of $n$ and $p$ in the dimension indicators. If the conformability
condition is met, the associative law states that any adjacent pair of matrices may be multiplied
out first, provided that the product is duly inserted in the exact place of the original pair.

Example 5

If $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $A = \begin{bmatrix} a_{11} & 0 \\ 0 & a_{22} \end{bmatrix}$, then

$$x'Ax = x'(Ax) = [x_1 \;\; x_2] \begin{bmatrix} a_{11}x_1 \\ a_{22}x_2 \end{bmatrix} = a_{11}x_1^2 + a_{22}x_2^2$$

Exactly the same result comes from

$$(x'A)x = [a_{11}x_1 \;\; a_{22}x_2] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = a_{11}x_1^2 + a_{22}x_2^2$$

In Example 5, the square matrix $A$ has nonzero elements $a_{11}$ and $a_{22}$ in the principal
diagonal, and zeros everywhere else. Such a matrix is called a diagonal matrix. When a
diagonal matrix $A$ appears in the product $x'Ax$, the resulting product gives a "weighted"
sum of squares, the weights for the $x_1^2$ and the $x_2^2$ terms being supplied by the elements in
the diagonal of $A$. This result is in contrast to the scalar product $x'x$, which yields a simple
(unweighted) sum of squares.
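
The behaviour of matrix products described in this subsection can be reproduced with a few lines of code. This is a rough sketch only: `matmul` is an illustrative helper, $A$ and $B$ are the matrices of Example 3, and the numerical values chosen for `x` and the diagonal matrix `D` are hypothetical stand-ins for the symbols of Example 5.

```python
def matmul(A, B):
    """Return AB for matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("A and B are not conformable for AB")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Example 3: the two products differ
A = [[1, 2], [3, 4]]
B = [[0, -1], [6, 7]]
AB = matmul(A, B)
BA = matmul(B, A)

# Example 5 with hypothetical numbers: x = (2, 3)', D = diag(5, 7)
x = [[2], [3]]
x_row = [[2, 3]]              # x'
D = [[5, 0], [0, 7]]
left = matmul(matmul(x_row, D), x)   # (x'D)x
right = matmul(x_row, matmul(D, x))  # x'(Dx)
```

Both groupings return the 1 x 1 matrix holding 5(2)(2) + 7(3)(3) = 83, a weighted sum of squares, whereas `AB` and `BA` are visibly different matrices.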


Example 6 


Let the economic ideal be defined as the national-income level $Y^0$ coupled with the inflation
rate $p^0$. And suppose that we view any positive deviation of the actual income $Y$ from $Y^0$ to
be equally undesirable as a negative deviation of the same magnitude, and similarly for
deviations of the actual inflation rate $p$ from $p^0$. Then we may write a social-loss function
such as

$$\Lambda = \alpha(Y - Y^0)^2 + \beta(p - p^0)^2$$

where $\alpha$ and $\beta$ are the weights assigned to the two sources of social loss. If deviations of $Y$
are considered to be the more serious type of loss, then $\alpha$ should exceed $\beta$. Note that the
squaring of the deviations produces two effects. First, upon squaring, a positive deviation
will receive the same loss value as a negative deviation of the same numerical magnitude.
Second, squaring causes the larger deviations to show up much more significantly in the
social-loss measure than minor deviations. Such a social-loss function can be expressed, if
desired, by the matrix product

$$[Y - Y^0 \;\;\; p - p^0] \begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix} \begin{bmatrix} Y - Y^0 \\ p - p^0 \end{bmatrix}$$
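
To see the social-loss function at work, one might code the matrix product directly. The sketch below is illustrative only: the function name `social_loss` and every number passed to it are hypothetical, chosen merely to show that equal positive and negative deviations give the same loss.

```python
def social_loss(Y, p, Y_ideal, p_ideal, alpha, beta):
    """Evaluate [dY dp] diag(alpha, beta) [dY dp]' as a weighted sum of squares."""
    dev = [Y - Y_ideal, p - p_ideal]             # deviation vector
    weights = [[alpha, 0], [0, beta]]            # diagonal weight matrix
    weighted = [sum(weights[i][j] * dev[j] for j in range(2))
                for i in range(2)]               # weights times deviations
    return sum(dev[i] * weighted[i] for i in range(2))

# Hypothetical figures: ideal income 100, ideal inflation 2
loss_above = social_loss(Y=105, p=3, Y_ideal=100, p_ideal=2, alpha=2, beta=1)
loss_below = social_loss(Y=95, p=1, Y_ideal=100, p_ideal=2, alpha=2, beta=1)
```

With these numbers both calls return 2(25) + 1(1) = 51, so overshooting and undershooting the ideal by the same amounts are penalized equally.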


Matrix multiplication is also distributive.

Distributive law $A(B + C) = AB + AC$ [premultiplication by $A$]

$(B + C)A = BA + CA$ [postmultiplication by $A$]

In each case, the conformability conditions for addition as well as for multiplication must,
of course, be observed.


EXERCISE 4.4 


1. Given $A = \begin{bmatrix} 3 & 6 \\ 2 & 4 \end{bmatrix}$, $B = \begin{bmatrix} -1 & 7 \\ 8 & 4 \end{bmatrix}$, and $C = \begin{bmatrix} 3 & 4 \\ 1 & 9 \end{bmatrix}$, verify that

(a) $(A + B) + C = A + (B + C)$

(b) $(A + B) - C = A + (B - C)$

2. The subtraction of a matrix $B$ may be considered as the addition of the matrix $(-1)B$.
Does the commutative law of addition permit us to state that $A - B = B - A$? If not,
how would you correct the statement?

3. Test the associative law of multiplication with the following matrices:

$$A = \begin{bmatrix} 5 & 3 \\ 0 & 5 \end{bmatrix} \qquad B = \begin{bmatrix} -8 & 0 & 7 \\ 1 & 3 & 2 \end{bmatrix} \qquad C = \begin{bmatrix} 1 & 0 \\ 0 & 3 \\ 7 & 1 \end{bmatrix}$$

4. Prove that for any two scalars $g$ and $k$

(a) $k(A + B) = kA + kB$

(b) $(g + k)A = gA + kA$

(Note: To prove a result, you cannot use specific examples.)

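5. For (a) through (d) find $C = AB$.

(a) $A = \begin{bmatrix} 12 & 14 \\ 20 & 5 \end{bmatrix}$, $B = \begin{bmatrix} 3 & 9 \\ 0 & 2 \end{bmatrix}$

(b) $A = \begin{bmatrix} 4 & 7 \\ 9 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 3 & 8 & 5 \\ 2 & 6 & 7 \end{bmatrix}$

(c) $A = \begin{bmatrix} 7 & 11 \\ 2 & 9 \\ 10 & 6 \end{bmatrix}$, $B = \begin{bmatrix} 12 & 4 & 5 \\ 3 & 6 & 1 \end{bmatrix}$

(d) $A = \begin{bmatrix} 6 & 2 & 5 \\ 7 & 9 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 10 & 1 \\ 11 & 3 \\ 2 & 9 \end{bmatrix}$

(e) Find (i) $C = AB$, and (ii) $D = BA$, if $A = \begin{bmatrix} -2 \\ 4 \\ 7 \end{bmatrix}$ and $B = [3 \;\; 6 \;\; -2]$

6. Prove that $(A + B)(C + D) = AC + AD + BC + BD$.

7. If the matrix $A$ in Example 5 had all its four elements nonzero, would $x'Ax$ still give a
weighted sum of squares? Would the associative law still apply?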

8. Name some situations or contexts where the notion of a weighted or unweighted sum 
of squares may be relevant. 


4.5 Identity Matrices and Null Matrices

Identity Matrices

We have referred earlier to the term identity matrix. Such a matrix is defined as a square
(repeat: square) matrix with 1s in its principal diagonal and 0s everywhere else. It is denoted
by the symbol $I$, or $I_n$, in which the subscript $n$ serves to indicate its row (as well as
column) dimension. Thus,

$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$


But both of these can also be denoted by $I$.

The importance of this special type of matrix lies in the fact that it plays a role similar
to that of the number 1 in scalar algebra. For any number $a$, we have $1(a) = a(1) = a$.
Similarly, for any matrix $A$, we have

$$IA = AI = A \qquad (4.8)$$

Example 1

Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix}$; then

$$IA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix} = A$$

$$AI = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix} = A$$

Because $A$ is $2 \times 3$, premultiplication and postmultiplication of $A$ by $I$ would call for identity
matrices of different dimensions, namely, $I_2$ and $I_3$, respectively. But in case $A$ is $n \times n$, then
the same identity matrix $I_n$ can be used, so that (4.8) becomes $I_nA = AI_n$, thus illustrating
an exception to the rule that matrix multiplication is not commutative.


The special nature of identity matrices makes it possible, during the multiplication
process, to insert or delete an identity matrix without affecting the matrix product. This
follows directly from (4.8). Recalling the associative law, we have, for instance,

$$\underset{(m \times n)}{A} \; \underset{(n \times n)}{I} \; \underset{(n \times p)}{B} = (AI)B = \underset{(m \times n)}{A} \; \underset{(n \times p)}{B}$$

which shows that the presence or absence of $I$ does not affect the product. Observe that
dimension conformability is preserved whether or not $I$ appears in the product.

An interesting case of (4.8) occurs when $A = I_n$, for then we have

$$AI_n = (I_n)^2 = I_n$$

which states that an identity matrix squared is equal to itself. A generalization of this result
is that

$$(I_n)^k = I_n \qquad (k = 1, 2, \ldots)$$

An identity matrix remains unchanged when it is multiplied by itself any number of times.
Any matrix with such a property (namely, $AA = A$) is referred to as an idempotent matrix.
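
Property (4.8) and the notion of idempotency lend themselves to a quick numerical check. The sketch below is illustrative: the helpers `identity`, `matmul`, and `is_idempotent` are names chosen here, and the 2 x 3 matrix is the one used in Example 1.

```python
def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Return AB for matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("A and B are not conformable for AB")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def is_idempotent(M):
    """True when MM = M."""
    return matmul(M, M) == M

A = [[1, 2, 3], [2, 0, 3]]        # the 2 x 3 matrix of Example 1
IA = matmul(identity(2), A)       # premultiplication needs I of dimension 2
AI = matmul(A, identity(3))       # postmultiplication needs I of dimension 3
```

Here `IA` and `AI` both reproduce `A`, and `is_idempotent(identity(3))` holds, while a diagonal matrix such as `[[2, 0], [0, 1]]` fails the test.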


Null Matrices 


Just as an identity matrix $I$ plays the role of the number 1, a null matrix (or zero matrix),
denoted by 0, plays the role of the number 0. A null matrix is simply a matrix whose
elements are all zero. Unlike $I$, the zero matrix is not restricted to being square. Thus it is
possible to write

$$\underset{(2 \times 2)}{0} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad \text{and} \qquad \underset{(2 \times 3)}{0} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

and so forth. A square null matrix is idempotent, but a nonsquare one is not. (Why?)

As the counterpart of the number 0, null matrices obey the following rules of operation
(subject to conformability) with regard to addition and multiplication:

$$\underset{(m \times n)}{A} + \underset{(m \times n)}{0} = \underset{(m \times n)}{0} + \underset{(m \times n)}{A} = \underset{(m \times n)}{A}$$

$$\underset{(m \times n)}{A} \; \underset{(n \times p)}{0} = \underset{(m \times p)}{0} \qquad \text{and} \qquad \underset{(q \times m)}{0} \; \underset{(m \times n)}{A} = \underset{(q \times n)}{0}$$

Note that, in multiplication, the null matrix to the left of the equals sign and the one to the
right may be of different dimensions.


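Example 2

$$A + 0 = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = A$$

Example 3

$$\underset{(2 \times 3)}{A} \; \underset{(3 \times 1)}{0} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \underset{(2 \times 1)}{0}$$

To the left, the null matrix is a $3 \times 1$ null vector; to the right, it is a $2 \times 1$ null vector.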


Idiosyncrasies of Matrix Algebra 

Despite the apparent similarities between matrix algebra and scalar algebra, the case of
matrices does display certain idiosyncrasies that serve to warn us not to "borrow" from
scalar algebra too unquestioningly. We have already seen that, in general, $AB \ne BA$ in
matrix algebra. Let us look at two more such idiosyncrasies of matrix algebra.

For one thing, in the case of scalars, the equation $ab = 0$ always implies that either $a$ or
$b$ is zero, but this is not so in matrix multiplication. Thus, we have

$$AB = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} -2 & 4 \\ 1 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

although neither $A$ nor $B$ is itself a zero matrix.

As another illustration, for scalars, the equation $cd = ce$ (with $c \ne 0$) implies that
$d = e$. The same does not hold for matrices. Thus, given

$$C = \begin{bmatrix} 2 & 3 \\ 6 & 9 \end{bmatrix} \qquad D = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \qquad E = \begin{bmatrix} -2 & 1 \\ 3 & 2 \end{bmatrix}$$

we find that

$$CD = CE = \begin{bmatrix} 5 & 8 \\ 15 & 24 \end{bmatrix}$$

even though $D \ne E$.

These strange results actually pertain only to the special class of matrices known as
singular matrices, of which the matrices $A$, $B$, and $C$ are examples. (Roughly, these matrices
contain a row which is a multiple of another row.) Nevertheless, such examples do
reveal the pitfalls of unwarranted extension of algebraic theorems to matrix operations.
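
Both idiosyncrasies can be confirmed by direct multiplication. The following is a small sketch using the matrices given in the text; `matmul` is again an illustrative helper rather than anything defined in the book.

```python
def matmul(A, B):
    """Return AB for matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("A and B are not conformable for AB")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# AB = 0 although neither factor is a null matrix
A = [[2, 4], [1, 2]]
B = [[-2, 4], [1, -2]]
AB = matmul(A, B)

# CD = CE although D differs from E
C = [[2, 3], [6, 9]]
D = [[1, 1], [1, 2]]
E = [[-2, 1], [3, 2]]
CD = matmul(C, D)
CE = matmul(C, E)
```

In each of `A`, `B`, and `C` the second row is a multiple of the first, which is the rough mark of singularity mentioned above.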


EXERCISE 4.5 

Given $A = \begin{bmatrix} -1 & 5 & 7 \\ 0 & -2 & 4 \end{bmatrix}$, $b = \begin{bmatrix} 9 \\ 6 \\ 0 \end{bmatrix}$, and $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$:

1. Calculate: (a) $AI$ (b) $IA$ (c) $Ix$ (d) $x'I$

Indicate the dimension of the identity matrix used in each case.

2. Calculate: (a) $Ab$ (b) $AIb$ (c) $x'IA$ (d) $x'A$

Does the insertion of $I$ in (b) affect the result in (a)? Does the deletion of $I$ in (d) affect
the result in (c)?

3. What is the dimension of the null matrix resulting from each of the following?

(a) Premultiply $A$ by a $5 \times 2$ null matrix.

(b) Postmultiply $A$ by a $3 \times 6$ null matrix.

(c) Premultiply $b$ by a $2 \times 3$ null matrix.

(d) Postmultiply $x$ by a $1 \times 5$ null matrix.

4. Show that the diagonal matrix

$$\begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}$$

can be idempotent only if each diagonal element is either 1 or 0. How many different
numerical idempotent diagonal matrices of dimension $n \times n$ can be constructed altogether
from such a matrix?


4.6 Transposes and Inverses

When the rows and columns of a matrix $A$ are interchanged, so that its first row becomes
the first column, and vice versa, we obtain the transpose of $A$, which is denoted by $A'$ or
$A^T$. The prime symbol is by no means new to us; it was used earlier to distinguish a row
vector from a column vector. In the newly introduced terminology, a row vector $x'$ constitutes
the transpose of the column vector $x$. The superscript $T$ in the alternative symbol is
obviously shorthand for the word transpose.


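Example 1

Given $\underset{(2 \times 3)}{A} = \begin{bmatrix} 3 & 8 & -9 \\ 1 & 0 & 4 \end{bmatrix}$ and $\underset{(2 \times 2)}{B} = \begin{bmatrix} 3 & 4 \\ 1 & 7 \end{bmatrix}$, we can interchange the rows and
columns and write

$$\underset{(3 \times 2)}{A'} = \begin{bmatrix} 3 & 1 \\ 8 & 0 \\ -9 & 4 \end{bmatrix} \qquad \text{and} \qquad \underset{(2 \times 2)}{B'} = \begin{bmatrix} 3 & 1 \\ 4 & 7 \end{bmatrix}$$

By definition, if a matrix $A$ is $m \times n$, then its transpose $A'$ must be $n \times m$. An $n \times n$ square
matrix, however, possesses a transpose with the same dimension.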


Example 2

If $C = \begin{bmatrix} 9 & -1 \\ 2 & 0 \end{bmatrix}$ and $D = \begin{bmatrix} 1 & 0 & 4 \\ 0 & 3 & 7 \\ 4 & 7 & 2 \end{bmatrix}$, then

$$C' = \begin{bmatrix} 9 & 2 \\ -1 & 0 \end{bmatrix} \qquad \text{and} \qquad D' = \begin{bmatrix} 1 & 0 & 4 \\ 0 & 3 & 7 \\ 4 & 7 & 2 \end{bmatrix}$$

Here, the dimension of each transpose is identical with that of the original matrix.

In $D'$, we also note the remarkable result that $D'$ inherits not only the dimension of $D$
but also the original array of elements! The fact that $D' = D$ is the result of the symmetry
of the elements with reference to the principal diagonal. Considering the principal diagonal
in $D$ as a mirror, the elements located to its northeast are exact images of the elements
to its southwest; hence the first row reads identically with the first column, and so forth. The
matrix $D$ exemplifies the special class of square matrices known as symmetric matrices.
Another example of such a matrix is the identity matrix $I$, which, as a symmetric matrix,
has the transpose $I' = I$.

Properties of Transposes 

The following properties characterize transposes:

$$(A')' = A \qquad (4.9)$$

$$(A + B)' = A' + B' \qquad (4.10)$$

$$(AB)' = B'A' \qquad (4.11)$$

The first says that the transpose of the transpose is the original matrix, a rather
self-evident conclusion.

The second property may be verbally stated thus: The transpose of a sum is the sum of
the transposes.


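Example 3

If $A = \begin{bmatrix} 4 & 1 \\ 9 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 \\ 7 & 1 \end{bmatrix}$, then

$$(A + B)' = \begin{bmatrix} 6 & 1 \\ 16 & 1 \end{bmatrix}' = \begin{bmatrix} 6 & 16 \\ 1 & 1 \end{bmatrix}$$

and

$$A' + B' = \begin{bmatrix} 4 & 9 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 2 & 7 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 6 & 16 \\ 1 & 1 \end{bmatrix}$$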


The third property is that the transpose of a product is the product of the transposes in
reverse order. To appreciate the necessity for the reversed order, let us examine the dimension
conformability of the two products on the two sides of (4.11). If we let $A$ be $m \times n$ and
$B$ be $n \times p$, then $AB$ will be $m \times p$, and $(AB)'$ will be $p \times m$. For equality to hold, it is
necessary that the right-hand expression $B'A'$ be of the identical dimension. Since $B'$ is
$p \times n$ and $A'$ is $n \times m$, the product $B'A'$ is indeed $p \times m$, as required. The dimension of
$B'A'$ thus works out. Note that, on the other hand, the product $A'B'$ is not even defined
unless $m = p$.
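
The reversed order in (4.11) can be illustrated with a short computation. The sketch below is hedged in the usual way: `transpose` and `matmul` are illustrative helpers, the square matrices are those of Example 3 in Sec. 4.4, and the nonsquare pair `A2` and `b2` is hypothetical.

```python
def transpose(A):
    """Interchange the rows and columns of A."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Return AB for matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("A and B are not conformable for AB")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, -1], [6, 7]]
lhs = transpose(matmul(A, B))                # (AB)'
rhs = matmul(transpose(B), transpose(A))     # B'A'

# Hypothetical nonsquare case: A2 is 2 x 3, b2 is 3 x 1, so m = 2 and p = 1
A2 = [[1, 2, 3], [4, 5, 6]]
b2 = [[1], [0], [2]]
lhs2 = transpose(matmul(A2, b2))
rhs2 = matmul(transpose(b2), transpose(A2))
try:
    matmul(transpose(A2), transpose(b2))     # A'B' is (3 x 2)(1 x 3)
    unreversed_defined = True
except ValueError:
    unreversed_defined = False
```

In the nonsquare case the unreversed product cannot even be formed, which is the dimension argument given above.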


Example 4

Given $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & -1 \\ 6 & 7 \end{bmatrix}$, we have

$$(AB)' = \begin{bmatrix} 12 & 13 \\ 24 & 25 \end{bmatrix}' = \begin{bmatrix} 12 & 24 \\ 13 & 25 \end{bmatrix}$$

and

$$B'A' = \begin{bmatrix} 0 & 6 \\ -1 & 7 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 12 & 24 \\ 13 & 25 \end{bmatrix}$$

This verifies the property.

Inverses and Their Properties

For a given matrix $A$, the transpose $A'$ is always derivable. On the other hand, its inverse
matrix (another type of "derived" matrix) may or may not exist. The inverse of matrix $A$,
denoted by $A^{-1}$, is defined only if $A$ is a square matrix, in which case the inverse is the
matrix that satisfies the condition

$$AA^{-1} = A^{-1}A = I \qquad (4.12)$$

That is, whether $A$ is pre- or postmultiplied by $A^{-1}$, the product will be the same identity
matrix. This is another exception to the rule that matrix multiplication is not commutative.
The following points are worth noting:

1. Not every square matrix has an inverse: squareness is a necessary condition, but not a
sufficient condition, for the existence of an inverse. If a square matrix $A$ has an inverse,
$A$ is said to be nonsingular; if $A$ possesses no inverse, it is called a singular matrix.

2. If $A^{-1}$ does exist, then the matrix $A$ can be regarded as the inverse of $A^{-1}$, just as $A^{-1}$
is the inverse of $A$. In short, $A$ and $A^{-1}$ are inverses of each other.

3. If $A$ is $n \times n$, then $A^{-1}$ must also be $n \times n$; otherwise it cannot be conformable for both
pre- and postmultiplication. The identity matrix produced by the multiplication will also
be $n \times n$.

4. If an inverse exists, then it is unique. To prove its uniqueness, let us suppose that $B$ has
been found to be an inverse for $A$, so that

$$AB = BA = I$$

Now assume that there is another matrix $C$ such that $AC = CA = I$. By premultiplying
both sides of $AB = I$ by $C$, we find that

$$CAB = CI \; (= C) \qquad \text{[by (4.8)]}$$

Since $CA = I$ by assumption, the preceding equation is reducible to

$$IB = C \qquad \text{or} \qquad B = C$$

That is, $B$ and $C$ must be one and the same inverse matrix. For this reason, we can speak
of the (as against an) inverse of $A$.

5. The two parts of condition (4.12), namely, $AA^{-1} = I$ and $A^{-1}A = I$, actually imply
each other, so that satisfying either equation is sufficient to establish the inverse relationship
between $A$ and $A^{-1}$. To prove this, we should show that if $AA^{-1} = I$, and
if there is a matrix $B$ such that $BA = I$, then $B = A^{-1}$ (so that $BA = I$ must in effect be
the equation $A^{-1}A = I$). Let us postmultiply both sides of the given equation $BA = I$
by $A^{-1}$; then

$$(BA)A^{-1} = IA^{-1}$$

$$B(AA^{-1}) = IA^{-1} \qquad \text{[associative law]}$$

$$BI = IA^{-1} \qquad [AA^{-1} = I \text{ by assumption}]$$

Therefore, as required,

$$B = A^{-1} \qquad \text{[by (4.8)]}$$

Analogously, it can be demonstrated that, if $A^{-1}A = I$, then the only matrix $C$ which
yields $CA^{-1} = I$ is $C = A$.


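Example 5

Let $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$ and $B = \dfrac{1}{6} \begin{bmatrix} 2 & -1 \\ 0 & 3 \end{bmatrix}$; then, since the scalar multiplier $\left(\frac{1}{6}\right)$ in $B$ can be
moved to the rear (commutative law), we can write

$$AB = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 0 & 3 \end{bmatrix} \frac{1}{6} = \begin{bmatrix} 6 & 0 \\ 0 & 6 \end{bmatrix} \frac{1}{6} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

This establishes $B$ as the inverse of $A$, and vice versa. The reverse multiplication, as expected,
also yields the same identity matrix:

$$BA = \frac{1}{6} \begin{bmatrix} 2 & -1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} = \frac{1}{6} \begin{bmatrix} 6 & 0 \\ 0 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$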

The following three properties of inverse matrices are of interest. If $A$ and $B$ are nonsingular
matrices with dimension $n \times n$, then

$$(A^{-1})^{-1} = A \qquad (4.13)$$

$$(AB)^{-1} = B^{-1}A^{-1} \qquad (4.14)$$

$$(A')^{-1} = (A^{-1})' \qquad (4.15)$$

The first says that the inverse of an inverse is the original matrix. The second states that
the inverse of a product is the product of the inverses in reverse order. And the last one
means that the inverse of the transpose is the transpose of the inverse. Note that in these
statements the existence of the inverses and the satisfaction of the conformability condition
are presupposed.

The validity of (4.13) is fairly obvious, but let us prove (4.14) and (4.15). Given the
product $AB$, let us find its inverse, call it $C$. From (4.12) we know that $CAB = I$; thus,
postmultiplication of both sides by $B^{-1}A^{-1}$ will yield

$$CABB^{-1}A^{-1} = IB^{-1}A^{-1} \; (= B^{-1}A^{-1}) \qquad (4.16)$$

But the left side is reducible to

$$CA(BB^{-1})A^{-1} = CAIA^{-1} \qquad \text{[by (4.12)]}$$

$$= CAA^{-1} = CI = C \qquad \text{[by (4.12) and (4.8)]}$$

Substitution of this into (4.16) then tells us that $C = B^{-1}A^{-1}$ or, in other words, that the
inverse of $AB$ is equal to $B^{-1}A^{-1}$, as alleged. In this proof, the equation $AA^{-1} =
A^{-1}A = I$ was utilized twice. Note that the application of this equation is permissible if
and only if a matrix and its inverse are strictly adjacent to each other in a product. We may
write $AA^{-1}B = IB = B$, but never $ABA^{-1} = B$.

The proof of (4.15) is as follows. Given $A'$, let us find its inverse, call it $D$. By definition,
we then have $DA' = I$. But we know that

$$(AA^{-1})' = I' = I$$

produces the same identity matrix. Thus we may write

$$DA' = (AA^{-1})'$$

$$= (A^{-1})'A' \qquad \text{[by (4.11)]}$$

Postmultiplying both sides by $(A')^{-1}$, we obtain

$$DA'(A')^{-1} = (A^{-1})'A'(A')^{-1}$$

$$\text{or} \qquad D = (A^{-1})' \qquad \text{[by (4.12)]}$$

Thus, the inverse of $A'$ is equal to $(A^{-1})'$, as alleged.

In the proofs just presented, mathematical operations were performed on whole blocks
of numbers. If those blocks of numbers had not been treated as mathematical entities
(matrices), the same operations would have been much more lengthy and involved. The beauty
of matrix algebra lies precisely in its simplification of such operations.
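
Properties (4.14) and (4.15) can be spot-checked on 2 x 2 matrices. The sketch below rests on an assumption not developed in this section: it uses the standard closed-form inverse of a 2 x 2 matrix (swap the diagonal elements, negate the off-diagonal ones, divide by $ad - bc$), whose derivation belongs to Chap. 5. The helper names and the choice of `B` are illustrative.

```python
from fractions import Fraction

def matmul(A, B):
    """Return AB for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Interchange the rows and columns of A."""
    return [list(col) for col in zip(*A)]

def inv2(M):
    """Inverse of a 2 x 2 matrix via the closed-form formula (assumed here)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[3, 1], [0, 2]]
B = [[1, 2], [3, 4]]   # hypothetical nonsingular matrix

inv_of_product = inv2(matmul(A, B))               # (AB)^{-1}
product_of_inverses = matmul(inv2(B), inv2(A))    # B^{-1} A^{-1}
inv_of_transpose = inv2(transpose(A))             # (A')^{-1}
transpose_of_inv = transpose(inv2(A))             # (A^{-1})'
not_adjacent = matmul(matmul(A, B), inv2(A))      # A B A^{-1}
```

The last product comes out as a matrix different from `B`, illustrating why a matrix and its inverse may be cancelled only when they stand strictly adjacent to each other.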

Inverse Matrix and Solution of Linear-Equation System 

The application of the concept of inverse matrix to the solution of a simultaneous-equation
system is immediate and direct. Referring to the equation system in (4.3), we pointed out
earlier that it can be written in matrix notation as

$$\underset{(3 \times 3)}{A} \; \underset{(3 \times 1)}{x} = \underset{(3 \times 1)}{d} \qquad (4.17)$$

where $A$, $x$, and $d$ are as defined in (4.4). Now if the inverse matrix $A^{-1}$ exists, the
premultiplication of both sides of the equation (4.17) by $A^{-1}$ will yield

$$A^{-1}Ax = A^{-1}d$$

or

$$\underset{(3 \times 1)}{x} = \underset{(3 \times 3)}{A^{-1}} \; \underset{(3 \times 1)}{d} \qquad (4.18)$$

The left side of (4.18) is a column vector of variables, whereas the right-hand product is a
column vector of certain known numbers. Thus, by definition of the equality of matrices or
vectors, (4.18) shows the set of values of the variables that satisfy the equation system, i.e.,
the solution values. Furthermore, since $A^{-1}$ is unique if it exists, $A^{-1}d$ must be a unique
vector of solution values. We shall therefore write the $x$ vector in (4.18) as $x^*$, to indicate
its status as a (unique) solution.

Methods of testing the existence of the inverse and of its calculation will be discussed in
Chap. 5. It may be stated here, however, that the inverse of the matrix $A$ in (4.4) is

$$A^{-1} = \frac{1}{52} \begin{bmatrix} 18 & -16 & -10 \\ -13 & 26 & 13 \\ -17 & 18 & 21 \end{bmatrix}$$

Thus (4.18) will turn out to be

$$\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix} = \frac{1}{52} \begin{bmatrix} 18 & -16 & -10 \\ -13 & 26 & 13 \\ -17 & 18 & 21 \end{bmatrix} \begin{bmatrix} 22 \\ 12 \\ 10 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}$$

which gives the solution: $x_1^* = 2$, $x_2^* = 3$, and $x_3^* = 1$.

The upshot is that one way of finding the solution of a linear-equation system
$Ax = d$, where the coefficient matrix $A$ is nonsingular, is to first find the inverse $A^{-1}$, and
then postmultiply $A^{-1}$ by the constant vector $d$. The product $A^{-1}d$ will then give the solution
values of the variables.
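
The computation in (4.18) is easy to replicate. The following sketch takes the inverse matrix and the constant vector exactly as stated above and forms the product $A^{-1}d$ with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# A^{-1} = (1/52) times the integer matrix given in the text
A_inv = [[Fraction(n, 52) for n in row]
         for row in [[18, -16, -10],
                     [-13, 26, 13],
                     [-17, 18, 21]]]

d = [22, 12, 10]  # constant vector of the system

# x* = A^{-1} d: each solution value is a row of A^{-1} times d
x_star = [sum(a * b for a, b in zip(row, d)) for row in A_inv]
print(x_star)
```

The resulting vector holds the solution values 2, 3, and 1.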


Example 6 


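As shown in Example 11 of Sec. 4.2, the simple national-income model

$$Y = C + I_0 + G_0$$
$$C = a + bY$$

can be written in matrix notation as $Ax = d$, where

$$A = \begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix} \qquad x = \begin{bmatrix} Y \\ C \end{bmatrix} \qquad \text{and} \qquad d = \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix}$$

The inverse of matrix $A$ is (see explanation in Sec. 5.6)

$$A^{-1} = \frac{1}{1 - b} \begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix}$$

Thus the solution of the model is $x^* = A^{-1}d$, or

$$\begin{bmatrix} Y^* \\ C^* \end{bmatrix} = \frac{1}{1 - b} \begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix} \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix} = \frac{1}{1 - b} \begin{bmatrix} I_0 + G_0 + a \\ b(I_0 + G_0) + a \end{bmatrix}$$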
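
The solution of the national-income model can be evaluated for particular parameter values. In the sketch below, the function name `solve_income_model` and all the numbers are hypothetical; the code simply applies the inverse matrix stated above and assumes $b \ne 1$ so that the inverse exists.

```python
from fractions import Fraction

def solve_income_model(a, b, I0, G0):
    """Return (Y*, C*) = A^{-1} d for Y = C + I0 + G0 and C = a + bY."""
    if b == 1:
        raise ValueError("b = 1 makes the coefficient matrix singular")
    scale = 1 / (1 - b)                  # the scalar 1/(1 - b)
    A_inv = [[scale, scale], [scale * b, scale]]
    d = [I0 + G0, a]
    Y = A_inv[0][0] * d[0] + A_inv[0][1] * d[1]
    C = A_inv[1][0] * d[0] + A_inv[1][1] * d[1]
    return Y, C

# Hypothetical parameter values: a = 50, b = 3/4, I0 = 30, G0 = 20
Y_star, C_star = solve_income_model(a=50, b=Fraction(3, 4), I0=30, G0=20)
```

With these values the model gives $Y^* = 400$ and $C^* = 350$, which satisfy both original equations: 350 + 30 + 20 = 400 and 50 + (3/4)(400) = 350.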


EXERCISE 4.6 


1. Given $A = \begin{bmatrix} 0 & 4 \\ -1 & 3 \end{bmatrix}$, $B = \begin{bmatrix} 3 & -8 \\ 0 & 1 \end{bmatrix}$, and $C = \begin{bmatrix} 1 & 0 & 9 \\ 6 & 1 & 1 \end{bmatrix}$, find $A'$, $B'$, and $C'$.

2. Use the matrices given in Prob. 1 to verify that

(a) $(A + B)' = A' + B'$ (b) $(AC)' = C'A'$

3. Generalize the result (4.11) to the case of a product of three matrices by proving that,
for any conformable matrices $A$, $B$, and $C$, the equation $(ABC)' = C'B'A'$ holds.

4. Given the following four matrices, test whether any one of them is the inverse of
another:

$$D = \begin{bmatrix} 1 & 12 \\ 0 & 3 \end{bmatrix} \qquad E = \begin{bmatrix} 1 & 1 \\ 6 & 8 \end{bmatrix} \qquad F = \begin{bmatrix} 1 & -4 \\ 0 & \frac{1}{3} \end{bmatrix} \qquad G = \begin{bmatrix} 4 & -\frac{1}{2} \\ -3 & \frac{1}{2} \end{bmatrix}$$

5. Generalize the result (4.14) by proving that, for any conformable nonsingular matrices
$A$, $B$, and $C$, the equation $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$ holds.

6. Let $A = I - X(X'X)^{-1}X'$.

(a) Must $A$ be square? Must $(X'X)$ be square? Must $X$ be square?

(b) Show that matrix $A$ is idempotent. [Note: If $X'$ and $X$ are not square, it is inappropriate
to apply (4.14).]

4.7 Finite Markov Chains 


A common application of matrix algebra is found in what is known as Markov processes or
Markov chains. Markov processes are used to measure or estimate movements over time.
This involves the use of a Markov transition matrix, where each value in the transition
matrix is a probability of moving from one state (location, job, etc.) to another state. There
is also a vector containing the initial distribution across the various states. By repeatedly
multiplying such a vector by the transition matrix, one can estimate changes across states
over time.

Consider the problem of internal employee movement within a company that has many
different branches, or outlets.* A simple illustration using two branches, such as Abbotsford
and Burnaby, will help to demonstrate the basics of a Markov process. To determine the
number of employees in Abbotsford tomorrow, we take the probability that the employees
will stay in the Abbotsford branch multiplied by the total number of employees currently in
Abbotsford, which gives the total number of current Abbotsford employees who will
remain tomorrow. Added to this number is the number of Burnaby employees transferring
to Abbotsford. This number is found by multiplying the total number of current Burnaby
employees by the probability of a Burnaby employee transferring to Abbotsford. Similarly,
the process would be the same for determining the number of employees in the Burnaby
region tomorrow, made up of those Burnaby employees who chose to remain and the
Abbotsford employees who transfer into the Burnaby region today. The process described
involves four probabilities. These four probabilities together can be arranged in a matrix.
This is known as a Markov transition matrix (or simply, a "Markov").

Let $A_t$ and $B_t$ denote the populations of Abbotsford and Burnaby, respectively, at some
time, $t$. Further, define the transitional probabilities as follows:

$P_{AA}$ = probability that a current $A$ remains an $A$

$P_{AB}$ = probability that a current $A$ moves to $B$

$P_{BB}$ = probability that a current $B$ remains a $B$

$P_{BA}$ = probability that a current $B$ moves to $A$

If we denote the distribution of employees across locations at time $t$ as a vector

$$x_t' = [A_t \;\; B_t]$$

and the transitional probabilities in matrix form

$$M = \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix}$$

then the distribution of employees across locations in the next period $(t + 1)$ is

$$\underset{(1 \times 2)}{x_t'} \; \underset{(2 \times 2)}{M} = \underset{(1 \times 2)}{x_{t+1}'}$$

$$[A_t \;\; B_t] \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} = [(A_tP_{AA} + B_tP_{BA}) \;\;\; (A_tP_{AB} + B_tP_{BB})]$$

$$= [A_{t+1} \;\; B_{t+1}]$$

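* We would like to thank Sarah Dunn for this example. This work comes from her final project while a
student at the British Columbia Institute of Technology, Burnaby, BC, Canada (June 2003).

To find the distribution of employees after two periods,

$$[A_{t+1} \;\; B_{t+1}] \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} = [A_{t+2} \;\; B_{t+2}]$$

or, since $[A_{t+1} \;\; B_{t+1}]$ is itself $[A_t \;\; B_t]$ postmultiplied by the transition matrix,

$$[A_t \;\; B_t] \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} = [A_t \;\; B_t] \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix}^2 = [A_{t+2} \;\; B_{t+2}]$$

In general, for $n$ periods,

$$[A_t \;\; B_t] \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix}^n = [A_{t+n} \;\; B_{t+n}]$$

The $2 \times 2$ probability matrix $M$ is known as the Markov transition matrix. For the case
where $n$ is exogenous, the process is known as a finite Markov chain.

Example 1

Suppose the initial distribution of employees across the two locations at time $t = 0$ is

$$x_0' = [A_0 \;\; B_0] = [100 \;\; 100]$$

In other words, there are initially equal numbers at each location. Further, let the transitional
probabilities in matrix form be as follows:

$$M = \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} = \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}$$

Then the distribution of employees across locations in the next period $(t = 1)$ is

$$[100 \;\; 100] \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix} = [110 \;\; 90] = [A_1 \;\; B_1]$$

The distribution after two periods is given by

$$[100 \;\; 100] \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}^2 = [100 \;\; 100] \begin{bmatrix} 0.61 & 0.39 \\ 0.52 & 0.48 \end{bmatrix} = [113 \;\; 87] = [A_2 \;\; B_2]$$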


The distribution after 10 periods $(t = 10)$ is given by

$$[100 \;\; 100] \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}^{10} = [100 \;\; 100] \begin{bmatrix} 0.5714 & 0.4286 \\ 0.5714 & 0.4286 \end{bmatrix} = [114.3 \;\; 85.7] = [A_{10} \;\; B_{10}]$$
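
The repeated postmultiplication in the two-branch example can be traced period by period. The sketch below uses the transition probabilities and the initial distribution given above; the helper names `step` and `distribution_after` are illustrative, and ordinary floating-point numbers are used, so results are compared only approximately.

```python
def step(x, M):
    """One period: the row vector x' postmultiplied by the transition matrix M."""
    return [sum(x[i] * M[i][j] for i in range(len(x)))
            for j in range(len(M[0]))]

def distribution_after(x0, M, n):
    """Distribution after n periods, i.e., x0' M^n, computed one step at a time."""
    x = list(x0)
    for _ in range(n):
        x = step(x, M)
    return x

M = [[0.7, 0.3],
     [0.4, 0.6]]
x0 = [100, 100]

x1 = distribution_after(x0, M, 1)    # about [110, 90]
x2 = distribution_after(x0, M, 2)    # about [113, 87]
x10 = distribution_after(x0, M, 10)  # about [114.3, 85.7]
x11 = distribution_after(x0, M, 11)
```

By the tenth period a further step changes the distribution only negligibly: `x10` and `x11` agree to several decimal places.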


Notice what happens when the Markov transition matrix is raised to higher and higher powers.
The new transition matrix found by raising the original matrix to increasingly higher
powers converges to a matrix where the rows are identical. This is referred to as the steady
state. What would you expect the eleventh or higher periods of distribution to look like?

Special Case: Absorbing Markov Chains

Now, let us extend the model by adding a third option: Employees can exit the company,
with

$P_{AE}$ = probability that a current $A$ chooses to exit ($E$)

$P_{BE}$ = probability that a current $B$ chooses to exit ($E$)

At this point, we will add the following assumptions:

$$P_{EA} = 0 \qquad P_{EB} = 0 \qquad P_{EE} = 1$$

where $P_{EA}$, $P_{EB}$, and $P_{EE}$ are the probabilities that an employee who is currently an $E$ will
go to $A$, $B$, or $E$, respectively. In other words, nobody who leaves the company ever returns.
It is also implied by these restrictions that our company never replaces employees that leave
(there are no new hires).


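Starting at time $t = 0$, our Markov chain now becomes

$$[A_0 \;\; B_0 \;\; E_0] \begin{bmatrix} P_{AA} & P_{AB} & P_{AE} \\ P_{BA} & P_{BB} & P_{BE} \\ P_{EA} & P_{EB} & P_{EE} \end{bmatrix}^n = [A_n \;\; B_n \;\; E_n]$$

or

$$[A_0 \;\; B_0 \;\; E_0] \begin{bmatrix} P_{AA} & P_{AB} & P_{AE} \\ P_{BA} & P_{BB} & P_{BE} \\ 0 & 0 & 1 \end{bmatrix}^n = [A_n \;\; B_n \;\; E_n]$$

(Assume $E_0 = 0$.)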









This type of Markov process is referred to as an absorbing Markov chain. Because of the
values of the transition probabilities found in the third row, we see that once an employee
becomes an E in one state (time period) that employee will remain there for all future states
(time periods). As n goes to infinity, $A_n$ and $B_n$ will approach zero and $E_n$ will approach
the value of the total number of workers at time zero (i.e., $A_0 + B_0 + E_0$).
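
The limiting behavior can be illustrated with a short simulation. This is only a sketch: the text leaves the transition probabilities symbolic, so the numbers in `P` below are hypothetical values chosen for illustration, as are the helper names `step` and `after`.

```python
def step(x, P):
    """One period: multiply the row vector x by the transition matrix P."""
    return [sum(x[i] * P[i][j] for i in range(len(x))) for j in range(len(P[0]))]

# Hypothetical transition probabilities (not from the text); each row sums to 1.
P = [[0.6, 0.3, 0.1],   # from A: stay in A, move to B, exit
     [0.3, 0.5, 0.2],   # from B: move to A, stay in B, exit
     [0.0, 0.0, 1.0]]   # from E: nobody who exits ever returns

def after(n, x0=(100.0, 100.0, 0.0)):
    """Distribution [A_n, B_n, E_n] after n periods, starting from x0 with E_0 = 0."""
    x = list(x0)
    for _ in range(n):
        x = step(x, P)
    return x

for n in (1, 10, 50, 200):
    print(n, [round(v, 3) for v in after(n)])
```

With these assumed probabilities, the first two entries shrink toward zero while the third approaches the initial total of 200, in line with the statement above.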


EXERCISE 4.7 

1. Consider the situation of a mass layoff (i.e., a factory shuts down) where 1,200 people
become unemployed and now begin a job search. In this case there are two states:
employed (E) and unemployed (U) with an initial vector

$x_0 = [E \quad U] = [0 \quad 1{,}200]$

Suppose that in any given period an unemployed person will find a job with probability
.7 and will therefore remain unemployed with a probability of .3. Additionally,
persons who find themselves employed in any given period may lose their job with a
probability of .1 (and will have a .9 probability of remaining employed).

(a) Set up the Markov transition matrix for this problem.

(b) What will be the number of unemployed people after (i) 2 periods; (ii) 3 periods;
(iii) 5 periods; (iv) 10 periods?

(c) What is the steady-state level of unemployment? 




Chapter 5

Linear Models and Matrix Algebra (Continued)


In Chap. 4, it was shown that a linear-equation system, however large, may be written in a
compact matrix notation. Furthermore, such an equation system can be solved by finding
the inverse of the coefficient matrix, provided the inverse exists. Now we must address
ourselves to the questions of how to test for the existence of the inverse and how to find that
inverse. Only after we have answered these questions will it be possible to apply matrix
algebra meaningfully to economic models.

5.1 Conditions for Nonsingularity of a Matrix

A given coefficient matrix A can have an inverse (i.e., can be "nonsingular") only if it is
square. As was pointed out earlier, however, the squareness condition is necessary but not
sufficient for the existence of the inverse $A^{-1}$. A matrix can be square, but singular
(without an inverse) nonetheless.

Necessary versus Sufficient Conditions 

The concepts of “necessary condition” and “sufficient condition” are used frequently in 
economics. It is important that we understand their precise meanings before proceeding 
further. 

A necessary condition is in the nature of a prerequisite: Suppose that a statement p is
true only if another statement q is true; then q constitutes a necessary condition of p.
Symbolically, we express this as follows:

\[
p \Rightarrow q \tag{5.1}
\]

which is read as "p only if q," or alternatively, "if p, then q." It is also logically correct to
interpret (5.1) to mean "p implies q." It may happen, of course, that we also have $p \Rightarrow w$
at the same time. Then both q and w are necessary conditions for p.

Example 1

If we let p be the statement "a person is a father" and q be the statement "a person is male,"
then the logical statement $p \Rightarrow q$ applies. A person is a father only if he is male, and to be
male is a necessary condition for fatherhood. Note, however, that the converse is not true:
fatherhood is not a necessary condition for maleness.
A different type of situation is one in which a statement p is true if q is true, but p can
also be true when q is not true. In this case, q is said to be a sufficient condition for p. The
truth of q suffices to establish the truth of p, but it is not a necessary condition for p. This
case is expressed symbolically by

\[
p \Leftarrow q \tag{5.2}
\]

which is read: "p if q" (without the word only), or alternatively, "if q, then p," as if reading
(5.2) backward. It can also be interpreted to mean "q implies p."

Example 2

If we let p be the statement "one can get to Europe" and q be the statement "one takes a
plane to Europe," then $p \Leftarrow q$. Flying can serve to get one to Europe, but since ocean
transportation is also feasible, flying is not a prerequisite. We can write $p \Leftarrow q$, but not $p \Rightarrow q$.

In a third possible situation, q is both necessary and sufficient for p. In such an event,
we write

\[
p \Leftrightarrow q \tag{5.3}
\]

which is read: "p if and only if q" (also written as "p iff q"). The double-headed arrow is
really a combination of the two types of arrow in (5.1) and (5.2), hence the joint use of the
two terms "if" and "only if." Note that (5.3) states not only that p implies q but also that q
implies p.

Example 3

If we let p be the statement "there are less than 30 days in the month" and q be the
statement "it is the month of February," then $p \Leftrightarrow q$. To have less than 30 days in the month, it
is necessary that it be February. Conversely, the specification of February is sufficient to
establish that there are less than 30 days in the month. Thus q is a necessary-and-sufficient
condition for p.

In order to prove $p \Rightarrow q$, it needs to be shown that q follows logically from p. Similarly,
to prove $p \Leftarrow q$ requires a demonstration that p follows logically from q. But to prove $p \Leftrightarrow q$
necessitates a demonstration that p and q follow from each other.


Necessary conditions and sufficient conditions are important as screening devices.
Consider a pool of applicants being considered for scholarship awards, or for job positions.
Since necessary conditions are in the nature of prerequisites, they serve to separate the
candidates into two groups: Those who fail to meet the necessary conditions are automatically
disqualified; those who satisfy the necessary conditions remain as admissible candidates.
To remain as an admissible candidate, however, carries no guarantee that the candidate will
eventually be successful. Thus, necessary conditions are more conclusive in screening out
the unsuccessful candidates than in identifying the successful ones. In general, we should
bear in mind that necessary conditions are not in themselves sufficient.

In contrast to necessary conditions, sufficient conditions serve directly to identify
successful candidates. A candidate that satisfies a sufficient condition is automatically a
successful one. Just as necessary conditions are not in themselves sufficient, sufficient
conditions are not in themselves necessary. This is because, along with any given sufficient
condition, there may exist other, less stringent, sufficient conditions, and the candidate who
fails to satisfy the given sufficient condition may yet qualify under an easier sufficient
condition. For example, a grade of A is sufficient for passing a course, but it is not a necessary
condition since a grade of B is also sufficient.

The most effective screening device is found in the necessary-and-sufficient conditions.
Failure to satisfy such a condition means the candidate is definitely out, and satisfaction of 
such a condition means the candidate is definitely in. We can find an immediate application 
of this in our present discussion of nonsingularity of a matrix. 


Conditions for Nonsingularity 

After the squareness condition (a necessary condition) is already met, a sufficient condition
for the nonsingularity of a matrix is that its rows be linearly independent (or, what amounts
to the same thing, that its columns be linearly independent). When the dual conditions
of squareness and linear independence are taken together, they constitute the
necessary-and-sufficient condition for nonsingularity (nonsingularity $\Leftrightarrow$ squareness and linear
independence).

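An n × n coefficient matrix A can be considered as an ordered set of row vectors, i.e., as
a column vector whose elements are themselves row vectors:

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
= \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix}
\]

where $v_i' = [a_{i1} \;\; a_{i2} \;\; \cdots \;\; a_{in}]$, $i = 1, 2, \ldots, n$. For the rows (row vectors) to be
linearly independent, none must be a linear combination of the rest. More formally, as was
mentioned in Sec. 4.3, linear row independence requires that the only set of scalars $k_i$
which can satisfy the vector equation

\[
\sum_{i=1}^{n} k_i v_i' = \underset{(1 \times n)}{0} \tag{5.4}
\]

be $k_i = 0$ for all i.

Example 4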


If the coefficient matrix is

\[
A = \begin{bmatrix} 3 & 4 & 5 \\ 0 & 1 & 2 \\ 6 & 8 & 10 \end{bmatrix}
= \begin{bmatrix} v_1' \\ v_2' \\ v_3' \end{bmatrix}
\]

then, since [6 8 10] = 2[3 4 5], we have $v_3' = 2v_1' = 2v_1' + 0v_2'$. Thus the third row is
expressible as a linear combination of the first two, and the rows are not linearly
independent. Alternatively, we may write the previous equation as

\[
2v_1' + 0v_2' - v_3' = [6 \;\; 8 \;\; 10] + [0 \;\; 0 \;\; 0] - [6 \;\; 8 \;\; 10] = [0 \;\; 0 \;\; 0]
\]

Inasmuch as the set of scalars that led to the zero vector of (5.4) is not $k_i = 0$ for all i, it
follows that the rows are linearly dependent.
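
The dependence relation in Example 4 can be verified directly. The sketch below is an illustration only; the function name `linear_combination` is an assumption, and the scalars 2, 0, and -1 are the ones used in the example.

```python
def linear_combination(scalars, rows):
    """Return the row vector sum_i k_i * v_i for the given scalars and row vectors."""
    return [sum(k * row[j] for k, row in zip(scalars, rows))
            for j in range(len(rows[0]))]

rows = [[3, 4, 5],
        [0, 1, 2],
        [6, 8, 10]]          # rows v1', v2', v3' of the matrix A in Example 4

# Scalars (2, 0, -1) are not all zero, yet they produce the zero vector.
print(linear_combination([2, 0, -1], rows))
```

Because a set of scalars that are not all zero yields the zero vector, condition (5.4) fails to force every $k_i$ to be zero, which is what linear dependence means.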
Unlike the squareness condition, the linear-independence condition cannot normally be
ascertained at a glance. Thus a method of testing linear independence among rows (or
columns) needs to be developed. Before we concern ourselves with that task, however, it
would strengthen our motivation first to have an intuitive understanding of why the
linear-independence condition is heaped together with the squareness condition at all. From the
discussion of counting equations and unknowns in Sec. 3.4, we recall the general
conclusion that, for a system of equations to possess a unique solution, it is not sufficient to have
the same number of equations as unknowns. In addition, the equations must be consistent
with and functionally independent (meaning, in the present context of linear systems,
linearly independent) of one another. There is a fairly obvious tie-in between the "same
number of equations as unknowns" criterion and the squareness (same number of rows and
columns) of the coefficient matrix. What the "linear independence among the rows"
requirement does is to preclude the inconsistency and the linear dependence among the
equations as well. Taken together, therefore, the dual requirement of squareness and row
independence in the coefficient matrix is tantamount to the conditions for the existence of
a unique solution enunciated in Sec. 3.4.

Let us illustrate how the linear dependence among the rows of the coefficient matrix can
cause inconsistency or linear dependence among the equations themselves. Let the
equation system Ax = d take the form

\[
\begin{bmatrix} 10 & 4 \\ 5 & 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}
\]

where the coefficient matrix A contains linearly dependent rows: $v_1' = 2v_2'$. (Note that
its columns are also dependent, the first being $\tfrac{5}{2}$ of the second.) We have not specified
the values of the constant terms $d_1$ and $d_2$, but there are only two distinct possibilities
regarding their relative values: (1) $d_1 = 2d_2$ and (2) $d_1 \neq 2d_2$. Under the first (with, say,
$d_1 = 12$ and $d_2 = 6$) the two equations are consistent but linearly dependent (just as the
two rows of matrix A are), for the first equation is merely the second equation times 2. One
equation is then redundant, and the system reduces in effect to a single equation,
$5x_1 + 2x_2 = 6$, with an infinite number of solutions. For the second possibility (with, say,
$d_1 = 12$ but $d_2 = 0$) the two equations are inconsistent, because if the first equation
($10x_1 + 4x_2 = 12$) is true, then, by halving each term, we can deduce that $5x_1 + 2x_2 = 6$;
consequently the second equation ($5x_1 + 2x_2 = 0$) cannot possibly be true also. Thus no
solution exists.

The upshot is that no unique solution will be available (under either possibility) so long
as the rows in the coefficient matrix A are linearly dependent. In fact, the only way to have
a unique solution is to have linearly independent rows (or columns) in the coefficient
matrix. In that case, matrix A will be nonsingular, which means that the inverse $A^{-1}$ does
exist, and that a unique solution $x^* = A^{-1}d$ can be found.

Rank of a Matrix 

Even though the concept of row independence has been discussed only with regard to square
matrices, it is equally applicable to any m × n rectangular matrix. If the maximum number
of linearly independent rows that can be found in such a matrix is r, the matrix is said to be
of rank r. (The rank also tells us the maximum number of linearly independent columns in
the said matrix.) The rank of an m × n matrix can be at most m or n, whichever is smaller.
Given a matrix with only two rows (or two columns), row independence (or column
independence) is easily verified by visual inspection: one only has to check whether one
row (column) is the exact multiple of the other. But for a matrix of larger dimension, visual
inspection may not be feasible, and a more formal method is needed. One method for
finding the rank of a matrix A (not necessarily square), i.e., for determining the number of
independent rows in A, involves transforming A into a so-called echelon matrix by using
certain "elementary row operations." A particular structural feature of the echelon matrix
will then tell us the rank of matrix A.

There are only three types of elementary row operations on a matrix:†

1. Interchange of any two rows in the matrix. 

2. Multiplication (or division) of a row by any scalar $k \neq 0$.

3. Addition of “k times any row" to another row. 

While each of these operations converts a given matrix A into a different form, none of 
them alters the rank. It is this characteristic of elementary row operations that enables us 
to read the rank of A from its echelon matrix. The easiest way to explain the method of 
echelon matrix is by a specific example. 


Example 5 


Find the rank of the matrix 


\[
A = \begin{bmatrix} 0 & -11 & -4 \\ 2 & 6 & 2 \\ 4 & 1 & 0 \end{bmatrix}
\]

from its echelon form. First, we check the first column of A for the presence of zero
elements. If there are zero elements in column 1, we move those zero elements to the bottom
of the matrix. In the case of A, we want to move the 0 (first element of column 1) to the
bottom of that column, which can be accomplished by interchanging row 1 and row 3
(using the first elementary row operation). The result is

\[
A_1 = \begin{bmatrix} 4 & 1 & 0 \\ 2 & 6 & 2 \\ 0 & -11 & -4 \end{bmatrix}
\]

Our next objective is to reshape the first column of $A_1$ into a unit vector $e_1$ as defined
in (4.7). To transform the element 4 into unity, we divide row 1 of $A_1$ by the scalar 4
(applying the second elementary row operation), which yields

\[
A_2 = \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 2 & 6 & 2 \\ 0 & -11 & -4 \end{bmatrix}
\]

Then, to transform the element 2 in column 1 of $A_2$ into 0, we multiply row 1 of $A_2$ by −2,
and then add the result to row 2 of $A_2$ (applying the third elementary row operation). The
resulting matrix,

\[
A_3 = \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & 5\tfrac{1}{2} & 2 \\ 0 & -11 & -4 \end{bmatrix}
\]

† Similar to elementary row operations, elementary column operations can also be defined.
For our purposes, row operations are sufficient.
now has the desired unit vector $e_1$ as its first column. Having achieved this, we now exclude
the first row of $A_3$ from further consideration, and continue to work only on the remaining
two rows, where we want to create a two-element unit vector in the second column by
transforming the element $5\tfrac{1}{2}$ into 1, and the element −11 into 0. To this end, we need to
divide row 2 of $A_3$ by $5\tfrac{1}{2}$, thereby changing the row into the vector $[0 \;\; 1 \;\; \tfrac{4}{11}]$, and then
add 11 times this vector to row 3 of $A_3$. The end result, in the form of

\[
A_4 = \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & 1 & \tfrac{4}{11} \\ 0 & 0 & 0 \end{bmatrix}
\]

exemplifies the echelon matrix, which, by definition, possesses three structural features.
First, nonzero rows (rows with at least one nonzero element) appear above the zero rows
(rows that contain only 0s). Second, in every nonzero row, the first nonzero element is
unity. Third, the unit element (the first nonzero element) in any row must appear to the left
of the counterpart unit element of the immediately following row. It should be clear by now
that all the elementary row operations we have undertaken are designed to produce these
features in $A_4$.

Now, we can simply read the rank of A from the number of nonzero rows present in the
echelon matrix $A_4$. Since $A_4$ contains two nonzero rows, we can conclude that r(A) = 2.
This is, of course, also the rank of matrices $A_1$ through $A_4$, because elementary row
operations do not alter the rank of a matrix.
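
The echelon procedure of Example 5 can be sketched in code. This is a minimal illustration, not the book's own algorithm: the function names `echelon` and `rank` are assumptions, exact fractions are used to avoid rounding, and the routine picks the first available nonzero row as pivot. That choice means its intermediate matrices (and even its final echelon form) may differ from $A_1$ through $A_4$ above, although the number of nonzero rows, and hence the rank, is the same.

```python
from fractions import Fraction

def echelon(rows):
    """Return an echelon form of a matrix using the three elementary row operations."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivot_row = 0
    for col in range(n):
        # Operation 1: interchange rows to bring a nonzero entry into the pivot position.
        pick = next((r for r in range(pivot_row, m) if A[r][col] != 0), None)
        if pick is None:
            continue
        A[pivot_row], A[pick] = A[pick], A[pivot_row]
        # Operation 2: divide the pivot row by its leading element to make it unity.
        lead = A[pivot_row][col]
        A[pivot_row] = [x / lead for x in A[pivot_row]]
        # Operation 3: add multiples of the pivot row to lower rows to clear the column.
        for r in range(pivot_row + 1, m):
            factor = A[r][col]
            A[r] = [x - factor * p for x, p in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == m:
            break
    return A

def rank(rows):
    """Rank = number of nonzero rows in the echelon form."""
    return sum(any(x != 0 for x in row) for row in echelon(rows))

A = [[0, -11, -4],
     [2, 6, 2],
     [4, 1, 0]]
print(rank(A))
```

For the matrix of Example 5 the sketch reports a rank of 2, agreeing with r(A) = 2 in the text.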


The method of echelon matrix transformation applies to nonsquare as well as square
matrices. We have chosen a square matrix for Example 5 because our immediate objective
is to address the question of nonsingularity, which pertains only to square matrices. By
definition, for an n × n matrix A to be nonsingular, it must have n linearly independent
rows (or columns); consequently, it must be of rank n, and its echelon matrix must contain
exactly n nonzero rows, with no zero rows at all. Conversely, an n × n matrix having rank
n must be nonsingular. Thus an n × n echelon matrix with no zero rows must be
nonsingular, as is the matrix from which the echelon matrix is derived via elementary row
operations. In Example 5, the matrix A is 3 × 3, but r(A) = 2; hence A is singular.


EXERCISE 5.1 


1. In the following paired statements, let p be the first statement and q the second. 
Indicate for each case whether (5.1), (5.2), or (5.3) applies. 

(a) It is a holiday; it is Thanksgiving Day.

(b) A geometric figure has four sides; it is a rectangle.

(c) Two ordered pairs (a, b) and (b, a) are equal; a is equal to b.

(d) A number is rational; it can be expressed as a ratio of two integers.

(e) A 4 × 4 matrix is nonsingular; the rank of the 4 × 4 matrix is 4.

(f) The gasoline tank in my car is empty; I cannot start my car.

(g) The letter is returned to the sender with the marking "addressee unknown"; the
sender wrote the wrong address on the envelope.
2. Let p be the statement "a geometric figure is a square," and let q be as follows:

(a) It has four sides.

(b) It has four equal sides.

(c) It has four equal sides each perpendicular to the adjacent one.

Which is true for each case: $p \Rightarrow q$, $p \Leftarrow q$, or $p \Leftrightarrow q$?

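3. Are the rows linearly independent in each of the following?

(a) $\begin{bmatrix} 24 & 8 \\ 9 & -3 \end{bmatrix}$

(b) $\begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$

(c) [matrix entries illegible in the source]

(d) $\begin{bmatrix} -1 & 5 \\ 2 & -10 \end{bmatrix}$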

4. Check whether the columns of each matrix in Prob. 3 are also linearly independent. Do 
you get the same answer as for row independence? 

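5. Find the rank of each of the following matrices from its echelon matrix, and comment
on the question of nonsingularity.

(a) $A = \begin{bmatrix} 1 & 5 & 1 \\ 0 & 3 & 9 \\ -1 & 0 & 0 \end{bmatrix}$

(b) $B = \begin{bmatrix} 0 & -1 & -4 \\ 3 & 1 & 2 \\ 6 & 1 & 0 \end{bmatrix}$

(c) $C = \begin{bmatrix} 7 & 6 & 3 & 3 \\ 0 & 1 & 2 & 1 \\ 8 & 0 & 0 & 8 \end{bmatrix}$

(d) $D = \begin{bmatrix} 2 & 7 & 9 & -1 \\ 1 & 1 & 0 & 1 \\ 0 & 5 & 9 & -3 \end{bmatrix}$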

6. By definition of linear dependence among rows of a matrix, one or more rows can be
expressed as a linear combination of some other rows. In the echelon matrix, linear
dependence is signified by the presence of one or more zero rows. What provides the
link between the presence of a linear combination of rows in a given matrix and the
presence of zero rows in the echelon matrix?


5.2 Test of Nonsingularity by Use of Determinant

To ascertain whether a square matrix is nonsingular, we can also make use of the concept
of determinant.

Determinants and Nonsingularity 

The determinant of a square matrix A, denoted by |A|, is a uniquely defined scalar
(number) associated with that matrix. Determinants are defined only for square matrices. The
smallest possible matrix is, of course, the 1 × 1 matrix $A = [a_{11}]$. By definition, its
determinant is equal to the single element $a_{11}$ itself: $|A| = |a_{11}| = a_{11}$. The symbol $|a_{11}|$ here
must not be confused with the look-alike symbol for the absolute value of a number. In the
absolute-value context, we have, for instance, not only |5| = 5, but also |−5| = 5, because
the absolute value of a number is its numerical value without regard to the algebraic sign.
In contrast, the determinant symbol preserves the sign of the element, so while |8| = 8
(a positive number), we have |−8| = −8 (a negative number). This distinction proves to
be crucial in the later discussion when we apply determinantal tests whose results depend
critically on the signs of determinants of various dimensions, including 1 × 1 ones, such as
$|a_{11}| = a_{11}$.
For a 2 × 2 matrix $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, its determinant is defined to be the sum of two
terms as follows:

\[
|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}
= a_{11}a_{22} - a_{21}a_{12} \qquad [= \text{a scalar}] \tag{5.5}
\]

which is obtained by multiplying the two elements in the principal diagonal of A and then
subtracting the product of the two remaining elements. In view of the dimension of matrix
A, the determinant |A| given in (5.5) is called a second-order determinant.

Example 1

Given $A = \begin{bmatrix} 10 & 4 \\ 8 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 5 \\ 0 & -1 \end{bmatrix}$, their determinants are

\[
|A| = \begin{vmatrix} 10 & 4 \\ 8 & 5 \end{vmatrix} = 10(5) - 8(4) = 18
\]

and

\[
|B| = \begin{vmatrix} 3 & 5 \\ 0 & -1 \end{vmatrix} = 3(-1) - 0(5) = -3
\]

While a determinant (enclosed by two vertical bars rather than brackets) is by definition a
scalar, a matrix as such does not have a numerical value. In other words, a determinant is
reducible to a number, but a matrix is, in contrast, a whole block of numbers. It should also
be emphasized that a determinant is defined only for a square matrix, whereas a matrix as
such does not have to be square.


Even at this early stage of discussion, it is possible to have an inkling of the relationship
between the linear dependence of the rows in a matrix A, on the one hand, and its
determinant |A|, on the other. The two matrices

\[
C = \begin{bmatrix} c_1' \\ c_2' \end{bmatrix} = \begin{bmatrix} 3 & 8 \\ 3 & 8 \end{bmatrix}
\qquad \text{and} \qquad
D = \begin{bmatrix} d_1' \\ d_2' \end{bmatrix} = \begin{bmatrix} 2 & 6 \\ 8 & 24 \end{bmatrix}
\]

both have linearly dependent rows, because $c_1' = c_2'$ and $d_2' = 4d_1'$. Both of their
determinants also turn out to be equal to zero:

\[
|C| = \begin{vmatrix} 3 & 8 \\ 3 & 8 \end{vmatrix} = 3(8) - 3(8) = 0
\]

\[
|D| = \begin{vmatrix} 2 & 6 \\ 8 & 24 \end{vmatrix} = 2(24) - 8(6) = 0
\]

This result strongly suggests that a "vanishing" determinant (a zero-value determinant)
may have something to do with linear dependence. We shall see that this is indeed the case.
Furthermore, the value of a determinant |A| can serve not only as a criterion for testing the
linear independence of the rows (hence the nonsingularity) of matrix A, but also as an input
in the calculation of the inverse $A^{-1}$, if it exists.
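
The second-order rule (5.5) is easy to check by machine. The sketch below is illustrative only; the function name `det2` is an assumption, and the four matrices are the ones labeled A, B, C, and D in the text above.

```python
def det2(m):
    """Second-order determinant per (5.5): a11*a22 - a21*a12."""
    (a11, a12), (a21, a22) = m
    return a11 * a22 - a21 * a12

A = [[10, 4], [8, 5]]
B = [[3, 5], [0, -1]]
C = [[3, 8], [3, 8]]     # rows are identical
D = [[2, 6], [8, 24]]    # second row is 4 times the first

for name, m in (("A", A), ("B", B), ("C", C), ("D", D)):
    print(name, det2(m))
```

The two matrices with linearly dependent rows, C and D, are exactly the ones whose determinants vanish.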


First, however, we must widen our vista by a discussion of higher-order determinants. 


Evaluating a Third-Order Determinant 

A determinant of order 3 is associated with a 3 × 3 matrix. Given

\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\]

its determinant has the value

\[
|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\]

\[
= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} + a_{12}a_{23}a_{31} - a_{12}a_{21}a_{33}
+ a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} \qquad [= \text{a scalar}] \tag{5.6}
\]

Looking first at the lower line of (5.6), we see the value of |A| expressed as a sum of six
product terms, three of which are prefixed by minus signs and three by plus signs.
Complicated as this sum may appear, there is nonetheless a very easy way of "catching"
all these six terms from a given third-order determinant. This is best explained
diagrammatically (Fig. 5.1). In the determinant shown in Fig. 5.1, each element in the top row
has been linked with two other elements via two solid arrows as follows: $a_{11} \to a_{22} \to a_{33}$,
$a_{12} \to a_{23} \to a_{31}$, and $a_{13} \to a_{32} \to a_{21}$. Each triplet of elements so linked can be
multiplied out, and their product taken as one of the six product terms in (5.6). The solid-arrow
product terms are to be prefixed with plus signs.

On the other hand, each top-row element has also been connected with two other
elements via two broken arrows as follows: $a_{11} \to a_{32} \to a_{23}$, $a_{12} \to a_{21} \to a_{33}$, and
$a_{13} \to a_{22} \to a_{31}$. Each triplet of elements so connected can also be multiplied out, and
their product taken as one of the six terms in (5.6). Such products are prefixed by minus
signs. The sum of all the six products will then be the value of the determinant.

FIGURE 5.1 (diagram: the third-order determinant, with solid and broken arrows linking the elements)

Example 2

\[
\begin{vmatrix} 2 & 1 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{vmatrix}
= (2)(5)(9) + (1)(6)(7) + (3)(8)(4) - (2)(8)(6) - (1)(4)(9) - (3)(5)(7) = -9
\]

Example 3

\[
\begin{vmatrix} -7 & 0 & 3 \\ 9 & 1 & 4 \\ 0 & 6 & 5 \end{vmatrix}
= (-7)(1)(5) + (0)(4)(0) + (3)(6)(9) - (-7)(6)(4) - (0)(9)(5) - (3)(1)(0) = 295
\]
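
The cross-diagonal method can be written out as a small function. This is a sketch for illustration; the name `det3` is an assumption, and the grouping into "plus" and "minus" products follows the solid-arrow and broken-arrow triplets described for Fig. 5.1.

```python
def det3(m):
    """Third-order determinant by cross-diagonal multiplication, per (5.6)."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    plus = a11 * a22 * a33 + a12 * a23 * a31 + a13 * a32 * a21    # solid-arrow products
    minus = a11 * a32 * a23 + a12 * a21 * a33 + a13 * a22 * a31   # broken-arrow products
    return plus - minus

print(det3([[2, 1, 3], [4, 5, 6], [7, 8, 9]]))     # determinant of Example 2
print(det3([[-7, 0, 3], [9, 1, 4], [0, 6, 5]]))    # determinant of Example 3
```

The function is deliberately limited to 3 × 3 input, since this shortcut does not carry over to higher orders.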
This method of cross-diagonal multiplication provides a handy way of evaluating a
third-order determinant, but unfortunately it is not applicable to determinants of orders higher
than 3. For the latter, we must resort to the so-called Laplace expansion of the determinant.


Evaluating an nth-Order Determinant by Laplace Expansion

Let us first explain the Laplace-expansion process for a third-order determinant. Returning
to the first line of (5.6), we see that the value of |A| can also be regarded as a sum of three
terms, each of which is a product of a first-row element and a particular second-order
determinant. This latter process of evaluating |A| (by means of certain lower-order
determinants) illustrates the Laplace expansion of the determinant.

The three second-order determinants in (5.6) are not arbitrarily determined, but are
specified by means of a definite rule. The first one, $\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$, is a subdeterminant of |A|
obtained by deleting the first row and first column of |A|. This is called the minor of the
element $a_{11}$ (the element at the intersection of the deleted row and column) and is denoted
by $|M_{11}|$. In general, the symbol $|M_{ij}|$ can be used to represent the minor obtained by
deleting the ith row and jth column of a given determinant. Since a minor is itself a determinant,
it has a value. As the reader can verify, the other two second-order determinants in (5.6) are,
respectively, the minors $|M_{12}|$ and $|M_{13}|$; that is,

\[
|M_{11}| \equiv \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
\qquad
|M_{12}| \equiv \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
\qquad
|M_{13}| \equiv \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\]

A concept closely related to the minor is that of the cofactor. A cofactor, denoted by
$|C_{ij}|$, is a minor with a prescribed algebraic sign attached to it.† The rule of sign is as
follows. If the sum of the two subscripts i and j in the minor $|M_{ij}|$ is even, then the cofactor
takes the same sign as the minor; that is, $|C_{ij}| \equiv |M_{ij}|$. If it is odd, then the cofactor takes
the opposite sign to the minor; that is, $|C_{ij}| \equiv -|M_{ij}|$. In short, we have

\[
|C_{ij}| \equiv (-1)^{i+j} |M_{ij}|
\]

where it is obvious that the expression $(-1)^{i+j}$ can be positive if and only if (i + j) is even.
The fact that a cofactor has a specific sign is of extreme importance and should always be
borne in mind.

Example 4

In the determinant

\[
\begin{vmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \\ 3 & 2 & 1 \end{vmatrix}
\]

the minor of the element 8 is

\[
|M_{12}| = \begin{vmatrix} 6 & 4 \\ 3 & 1 \end{vmatrix} = -6
\]

but the cofactor of the same element is

\[
|C_{12}| = -|M_{12}| = 6
\]

because i + j = 1 + 2 = 3 is odd. Similarly, the cofactor of the element 4 is

\[
|C_{23}| = -|M_{23}| = -\begin{vmatrix} 9 & 8 \\ 3 & 2 \end{vmatrix} = 6
\]

† Many writers use the symbols $M_{ij}$ and $C_{ij}$ (without the vertical bars) for minors and cofactors. We
add the vertical bars to give visual emphasis to the fact that minors and cofactors are in the nature of
determinants and, as such, have scalar values.
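
The minor and cofactor of a third-order determinant can be computed as follows. This is a minimal sketch restricted to 3 × 3 input; the function names `minor` and `cofactor` are assumptions, and subscripts are 1-based to match the notation $|M_{ij}|$ and $|C_{ij}|$.

```python
def minor(m, i, j):
    """|M_ij| of a 3 x 3 array: delete row i and column j (1-based), take the 2 x 2 determinant."""
    sub = [[x for col, x in enumerate(row, start=1) if col != j]
           for r, row in enumerate(m, start=1) if r != i]
    (a, b), (c, d) = sub
    return a * d - c * b

def cofactor(m, i, j):
    """|C_ij| = (-1)**(i + j) * |M_ij|."""
    return (-1) ** (i + j) * minor(m, i, j)

A = [[9, 8, 7],
     [6, 5, 4],
     [3, 2, 1]]          # the determinant of Example 4

print(minor(A, 1, 2), cofactor(A, 1, 2))   # element 8
print(minor(A, 2, 3), cofactor(A, 2, 3))   # element 4
```

In both cases i + j is odd, so the cofactor is the minor with its sign reversed.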
Using these new concepts, we can express a third-order determinant as

\[
|A| = a_{11}|M_{11}| - a_{12}|M_{12}| + a_{13}|M_{13}|
= a_{11}|C_{11}| + a_{12}|C_{12}| + a_{13}|C_{13}| = \sum_{j=1}^{3} a_{1j}|C_{1j}| \tag{5.7}
\]

i.e., as a sum of three terms, each of which is the product of a first-row element and its
corresponding cofactor. Note the difference in the signs of the $a_{12}|M_{12}|$ and $a_{12}|C_{12}|$ terms in
(5.7). This is because 1 + 2 gives an odd number.

The Laplace expansion of a third-order determinant serves to reduce the evaluation
problem to one of evaluating only certain second-order determinants. A similar reduction
is achieved in the Laplace expansion of higher-order determinants. In a fourth-order
determinant |B|, for instance, the top row will contain four elements, $b_{11} \ldots b_{14}$; thus, in the
spirit of (5.7), we may write

\[
|B| = \sum_{j=1}^{4} b_{1j}|C_{1j}|
\]

where the cofactors $|C_{1j}|$ are of order 3. Each third-order cofactor can then be evaluated as
in (5.6). In general, the Laplace expansion of an nth-order determinant will reduce the
problem to one of evaluating n cofactors, each of which is of the (n − 1)st order, and the
repeated application of the process will methodically lead to lower and lower orders of
determinants, eventually culminating in the basic second-order determinants as defined
in (5.5). Then the value of the original determinant can be easily calculated.

Although the process of Laplace expansion has been couched in terms of the cofactors
of the first-row elements, it is also feasible to expand a determinant by the cofactor of any
row or, for that matter, of any column. For instance, if the first column of a third-order
determinant |A| consists of the elements $a_{11}$, $a_{21}$, and $a_{31}$, expansion by the cofactors of
these elements will also yield the value of |A|:

\[
|A| = a_{11}|C_{11}| + a_{21}|C_{21}| + a_{31}|C_{31}| = \sum_{i=1}^{3} a_{i1}|C_{i1}|
\]


Example 5 


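Given $|A| = \begin{vmatrix} 5 & 6 & 1 \\ 2 & 3 & 0 \\ 7 & -3 & 0 \end{vmatrix}$, expansion by the first row produces the result

\[
|A| = 5 \begin{vmatrix} 3 & 0 \\ -3 & 0 \end{vmatrix}
- 6 \begin{vmatrix} 2 & 0 \\ 7 & 0 \end{vmatrix}
+ \begin{vmatrix} 2 & 3 \\ 7 & -3 \end{vmatrix}
= 0 + 0 - 27 = -27
\]

But expansion by the first column yields the identical answer:

\[
|A| = 5 \begin{vmatrix} 3 & 0 \\ -3 & 0 \end{vmatrix}
- 2 \begin{vmatrix} 6 & 1 \\ -3 & 0 \end{vmatrix}
+ 7 \begin{vmatrix} 6 & 1 \\ 3 & 0 \end{vmatrix}
= 0 - 6 - 21 = -27
\]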


Insofar as numerical calculation is concerned, this fact affords us an opportunity to
choose some "easy" row or column for expansion. A row or column with the largest
number of 0s or 1s is always preferable for this purpose, because a 0 times its cofactor is simply
0, so that the term will drop out, and a 1 times its cofactor is simply the cofactor itself, so
that at least one multiplication step can be saved. In Example 5, the easiest way to expand
the determinant is by the third column, which consists of the elements 1, 0, and 0. We
could therefore have evaluated it thus:

\[
|A| = 1 \begin{vmatrix} 2 & 3 \\ 7 & -3 \end{vmatrix} = -6 - 21 = -27
\]


To sum up, the value of a determinant |A| of order n can be found by the Laplace
expansion of any row or any column as follows:

\[
|A| = \sum_{j=1}^{n} a_{ij}|C_{ij}| \qquad [\text{expansion by the } i\text{th row}]
\]

\[
\phantom{|A|} = \sum_{i=1}^{n} a_{ij}|C_{ij}| \qquad [\text{expansion by the } j\text{th column}] \tag{5.8}
\]
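
The recursive character of the Laplace expansion lends itself to a short program. The following sketch expands along the first row only, which is one admissible choice under (5.8); the function name `det` is an assumption, and the approach is meant to mirror the text rather than to be efficient for large matrices.

```python
def det(m):
    """Determinant of a square matrix (list of rows) by Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[0][j] == 0:
            continue  # a zero element times its cofactor drops out
        # Minor: delete the first row and the (j+1)th column.
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        # The sign (-1)**(1 + (j+1)) equals (-1)**j for the 0-based index j.
        total += (-1) ** j * m[0][j] * det(sub)
    return total

A = [[5, 6, 1],
     [2, 3, 0],
     [7, -3, 0]]         # the determinant of Example 5
print(det(A))
```

Applied to Example 5, the sketch returns −27, the value obtained there by expanding along the first row, the first column, and the third column.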


EXERCISE 5.2 


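1. Evaluate the following determinants:

(a) $\begin{vmatrix} 8 & 1 & 3 \\ 4 & 0 & 1 \\ 6 & 0 & 3 \end{vmatrix}$

(b) $\begin{vmatrix} 1 & 2 & 3 \\ 4 & 7 & 5 \\ 3 & 6 & 9 \end{vmatrix}$

(c) $\begin{vmatrix} 4 & 0 & 2 \\ 6 & 0 & 3 \\ 8 & 2 & 3 \end{vmatrix}$

(d) $\begin{vmatrix} 1 & 1 & 4 \\ 8 & 11 & -2 \\ 0 & 4 & 7 \end{vmatrix}$

(e) $\begin{vmatrix} a & b & c \\ b & c & a \\ c & a & b \end{vmatrix}$

(f) $\begin{vmatrix} x & 5 & 0 \\ 3 & y & 2 \\ 9 & -1 & 8 \end{vmatrix}$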

2. Determine the signs to be attached to the relevant minors in order to get the following
cofactors of a determinant: $|C_{13}|$, $|C_{23}|$, $|C_{33}|$, $|C_{41}|$, and $|C_{34}|$.

3. Given $\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}$, find the minors and cofactors of the elements a, b, and f.


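4. Evaluate the following determinants:

(a) $\begin{vmatrix} 1 & 2 & 0 & 9 \\ 2 & 3 & 4 & 6 \\ 1 & 6 & 0 & -1 \\ 0 & -5 & 0 & 8 \end{vmatrix}$

(b) $\begin{vmatrix} 2 & 7 & 0 & 1 \\ 5 & 6 & 4 & 8 \\ 0 & 0 & 9 & 0 \\ 1 & -3 & 1 & 4 \end{vmatrix}$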


5. In the first determinant of Prob. 4, find the value of the cofactor of the element 9.

6. Find the minors and cofactors of the third row, given

\[
A = \begin{bmatrix} 9 & 11 & 4 \\ 3 & 2 & 7 \\ 6 & 10 & 4 \end{bmatrix}
\]

7. Use Laplace expansion to find the determinant of

\[
A = \begin{bmatrix} 15 & 7 & 9 \\ 2 & 5 & 6 \\ 9 & 0 & 12 \end{bmatrix}
\]
5.3 Basic Properties of Determinants

We can now discuss some properties of determinants which will enable us to "discover" the
connection between linear dependence among the rows of a square matrix and the
vanishing of the determinant of that matrix.

Five basic properties will be discussed here. These are properties common to
determinants of all orders, although we shall illustrate mostly with second-order determinants:

Property I The interchange of rows and columns does not affect the value of a
determinant. In other words, the determinant of a matrix A has the same value as that of its
transpose A′; that is, |A| = |A′|.


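Example 1

$$\begin{vmatrix} 4 & 3 \\ 5 & 6 \end{vmatrix} = \begin{vmatrix} 4 & 5 \\ 3 & 6 \end{vmatrix} = 9$$

Example 2

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & c \\ b & d \end{vmatrix} = ad - bc$$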


Property II The interchange of any two rows (or any two columns) will alter the sign, but
not the numerical value, of the determinant. (This property is obviously related to the first
elementary row operation on a matrix.)


Example 3

$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$, but the interchange of the two rows yields

$$\begin{vmatrix} c & d \\ a & b \end{vmatrix} = cb - ad = -(ad - bc)$$


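Example 4

$\begin{vmatrix} 0 & 1 & 3 \\ 2 & 5 & 7 \\ 3 & 0 & 1 \end{vmatrix} = -26$, but the interchange of the first and third columns yields
$\begin{vmatrix} 3 & 1 & 0 \\ 7 & 5 & 2 \\ 1 & 0 & 3 \end{vmatrix} = 26$.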


Property III The multiplication of any one row (or one column) by a scalar k will change
the value of the determinant k-fold. (This property is related to the second elementary row
operation on a matrix.)


Example 5

By multiplying the top row of the determinant in Example 3 by k, we get

$$\begin{vmatrix} ka & kb \\ c & d \end{vmatrix} = kad - kbc = k(ad - bc) = k\begin{vmatrix} a & b \\ c & d \end{vmatrix}$$


It is important to distinguish between the two expressions $kA$ and $k|A|$. In multiplying a
matrix A by a scalar k, all the elements in A are to be multiplied by k. But, if we read the
equation in the present example from right to left, it should be clear that, in multiplying
a determinant $|A|$ by k, only a single row (or column) should be multiplied by k. This
equation, therefore, in effect gives us a rule for factoring a determinant: whenever any single
row or column contains a common divisor, it may be factored out of the determinant.


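Example 6

Factoring the first column and the second row in turn, we have

$$\begin{vmatrix} 15a & 7b \\ 12c & 2d \end{vmatrix} = 3\begin{vmatrix} 5a & 7b \\ 4c & 2d \end{vmatrix} = 3(2)\begin{vmatrix} 5a & 7b \\ 2c & d \end{vmatrix} = 6(5ad - 14bc)$$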


The direct evaluation of the original determinant will, of course, produce the same answer. 


In contrast, the factoring of a matrix requires the presence of a common divisor for all
its elements, as in

$$\begin{bmatrix} ka & kb \\ kc & kd \end{bmatrix} = k\begin{bmatrix} a & b \\ c & d \end{bmatrix}$$


Property IV The addition (subtraction) of a multiple of any row to (from) another row will
leave the value of the determinant unaltered. The same holds true if we replace the word
row by column in the previous statement. (This property is related to the third elementary
row operation on a matrix.)


Example 7

Adding k times the top row of the determinant in Example 3 to its second row, we end up
with the original determinant:

$$\begin{vmatrix} a & b \\ c + ka & d + kb \end{vmatrix} = a(d + kb) - b(c + ka) = ad - bc = \begin{vmatrix} a & b \\ c & d \end{vmatrix}$$


Property V If one row (or column) is a multiple of another row (or column), the value of
the determinant will be zero. As a special case of this, when two rows (or two columns) are
identical, the determinant will vanish.


Example 8

$$\begin{vmatrix} 2a & 2b \\ a & b \end{vmatrix} = 2ab - 2ab = 0 \qquad \begin{vmatrix} c & c \\ d & d \end{vmatrix} = cd - cd = 0$$


Additional examples of this type of "vanishing" determinant can be found in Exercise 5.2-1. 


This important property is, in fact, a logical consequence of Property IV. To understand
this, let us apply Property IV to the two determinants in Example 8 and watch the outcome.
For the first one, try to subtract twice the second row from the top row; for the second
determinant, subtract the second column from the first column. Since these operations do
not alter the values of the determinants, we can write

$$\begin{vmatrix} 2a & 2b \\ a & b \end{vmatrix} = \begin{vmatrix} 0 & 0 \\ a & b \end{vmatrix} \qquad \begin{vmatrix} c & c \\ d & d \end{vmatrix} = \begin{vmatrix} 0 & c \\ 0 & d \end{vmatrix}$$


The new (reduced) determinants now contain, respectively, a row and a column of zeros;
thus their Laplace expansion must yield a value of zero in both cases. In general, when one
row (column) is a multiple of another row (column), the application of Property IV can always
reduce all elements of that row (column) to zero, and Property V therefore follows.

The basic properties just discussed are useful in several ways. For one thing, they can be
of great help in simplifying the task of evaluating determinants. By subtracting multiples
of one row (or column) from another, for instance, the elements of the determinant may be
reduced to much smaller and simpler numbers. Factoring, if feasible, can also accomplish
the same. If we can indeed apply these properties to transform some row or column into a
form containing mostly 0s or 1s, Laplace expansion of the determinant will become a much
more manageable task.
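The five properties can be spot-checked numerically. The sketch below does so for a second-order determinant; the scalar `k = 7` and the helper name `det2` are assumptions made for illustration, and a numerical check of one case is of course not a proof.

```python
def det2(m):
    """Second-order determinant |a b; c d| = ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c


A = [[4, 3], [5, 6]]   # the determinant used in Example 1
k = 7                  # arbitrary scalar, assumed for illustration

transpose = [list(col) for col in zip(*A)]
swapped = [A[1], A[0]]                                       # interchange rows
scaled = [[k * x for x in A[0]], A[1]]                       # multiply row 1 by k
added = [A[0], [A[1][j] + k * A[0][j] for j in range(2)]]    # row 2 + k * row 1
dependent = [A[0], [k * x for x in A[0]]]                    # row 2 = k * row 1

checks = {
    "I": det2(transpose) == det2(A),
    "II": det2(swapped) == -det2(A),
    "III": det2(scaled) == k * det2(A),
    "IV": det2(added) == det2(A),
    "V": det2(dependent) == 0,
}
print(checks)
```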

Determinantal Criterion for Nonsingularity 

Our present concern, however, is primarily to link the linear dependence of rows with the
vanishing of a determinant. For this purpose, Property V can be invoked. Consider an
equation system Ax = d:


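$$\begin{bmatrix} 3 & 4 & 2 \\ 15 & 20 & 10 \\ 4 & 0 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}$$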


This system can have a unique solution if and only if the rows in the coefficient matrix A
are linearly independent, so that A is nonsingular. But the second row is five times the first;
the rows are indeed dependent, and hence no unique solution exists. The detection of this
row dependence was by visual inspection, but by virtue of Property V we could also have
discovered it through the fact that $|A| = 0$.

The row dependence in a matrix may, of course, assume a more intricate and secretive
pattern. For instance, in the matrix

$$B = \begin{bmatrix} 4 & 1 & 2 \\ 5 & 2 & 1 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} v_1' \\ v_2' \\ v_3' \end{bmatrix}$$

there exists row dependence because $2v_1' - v_2' - 3v_3' = 0$; yet this fact defies visual
detection. Even in this case, however, Property V will give us a vanishing determinant, $|B| = 0$,
since by adding three times $v_3'$ to $v_2'$ and subtracting twice $v_1'$ from it, the second row can be
reduced to a zero vector. In general, any pattern of linear dependence among rows will be
reflected in a vanishing determinant, and herein lies the beauty of Property V! Conversely,
if the rows are linearly independent, the determinant must have a nonzero value.

We have, in the previous two paragraphs, tied the nonsingularity of a matrix principally
to the linear independence among rows. But, on occasion, we have made the claim that, for
a square matrix A, row independence $\Leftrightarrow$ column independence. We are now equipped to
prove that claim:

According to Property I, we know that $|A| = |A'|$. Since row independence in A $\Leftrightarrow |A| \neq 0$,
we may also state that row independence in A $\Leftrightarrow |A'| \neq 0$. But $|A'| \neq 0 \Leftrightarrow$ row independence
in the transpose A' $\Leftrightarrow$ column independence in A (rows of A' are by definition the
columns of A). Therefore, row independence in A $\Leftrightarrow$ column independence in A.

Our discussion of the test of nonsingularity can now be summarized. Given a linear-equation
system Ax = d, where A is an n x n coefficient matrix,

$|A| \neq 0 \Leftrightarrow$ there is row (column) independence in matrix A
$\Leftrightarrow$ A is nonsingular
$\Leftrightarrow$ $A^{-1}$ exists
$\Leftrightarrow$ a unique solution $x^* = A^{-1}d$ exists

Thus the value of the determinant of the coefficient matrix, $|A|$, provides a convenient
criterion for testing the nonsingularity of matrix A and the existence of a unique solution to
the equation system Ax = d. Note, however, that the determinantal criterion says nothing
about the algebraic signs of the solution values; i.e., even though we are assured of a unique
solution when $|A| \neq 0$, we may sometimes get negative solution values that are economically
inadmissible.


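Example 9

Does the equation system

$$\begin{aligned} 7x_1 - 3x_2 - 3x_3 &= 7 \\ 2x_1 + 4x_2 + x_3 &= 0 \\ -2x_2 - x_3 &= 2 \end{aligned}$$

possess a unique solution? The determinant $|A|$ is

$$\begin{vmatrix} 7 & -3 & -3 \\ 2 & 4 & 1 \\ 0 & -2 & -1 \end{vmatrix} = -8 \neq 0$$

Therefore a unique solution does exist.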
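The determinantal criterion lends itself to a quick numerical check. The sketch below applies it to the matrix B discussed above (whose rows satisfy $2v_1' - v_2' - 3v_3' = 0$) and to the coefficient matrix of Example 9. The helper names are hypothetical, and exact integer arithmetic is assumed so that the comparison with zero is safe.

```python
def minor(matrix, i, j):
    """Submatrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det(matrix):
    """Determinant by Laplace expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(
        (-1) ** j * matrix[0][j] * det(minor(matrix, 0, j))
        for j in range(len(matrix))
    )


def is_nonsingular(matrix):
    """Determinantal criterion: A is nonsingular if and only if |A| != 0."""
    return det(matrix) != 0


B = [[4, 1, 2], [5, 2, 1], [1, 0, 1]]            # rows are linearly dependent
A_ex9 = [[7, -3, -3], [2, 4, 1], [0, -2, -1]]    # coefficients of Example 9

# The hidden dependence among the rows of B: 2*v1 - v2 - 3*v3 = 0.
combination = [2 * x - y - 3 * z for x, y, z in zip(*B)]

print(combination, det(B), is_nonsingular(B))
print(det(A_ex9), is_nonsingular(A_ex9))
```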


Rank of a Matrix Redefined 

The rank of a matrix A was earlier defined to be the maximum number of linearly independent
rows in A. In view of the link between row independence and the nonvanishing of the
determinant, we can redefine the rank of an m x n matrix as the maximum order of a
nonvanishing determinant that can be constructed from the rows and columns of that matrix.
The rank of any matrix is a unique number.

Obviously, the rank can at most be m or n, whichever is smaller, because a determinant
is defined only for a square matrix, and from a matrix of dimension, say, 3 x 5, the largest
possible determinants (vanishing or not) will be of order 3. Symbolically, this fact may be
expressed as follows:

$$r(A) \leq \min\{m, n\}$$

which is read: "The rank of A is less than or equal to the minimum of the set of two numbers
m and n." The rank of an n x n nonsingular matrix A must be n; in that case, we may
write $r(A) = n$.

Sometimes, one may be interested in the rank of the product of two matrices. In that
case, the following rule can be of use:

$$r(AB) \leq \min\{r(A), r(B)\} \qquad (5.9)$$

While this rule does not yield a unique value of $r(AB)$, the application of the rule can
nevertheless lead to unique results. In particular, we can use (5.9) to show that if a matrix A,
with $r(A) = j$, is multiplied by any (conformable) nonsingular matrix B, the rank of the
product matrix AB (or BA, as the case may be) must be j. We shall prove this for the
product AB (the case of BA is analogous). First, looking at the right-hand side of (5.9), we see
only three possible cases: (i) $r(A) < r(B)$, (ii) $r(A) = r(B)$, and (iii) $r(A) > r(B)$.
For cases (i) and (ii), (5.9) reduces directly to $r(AB) \leq r(A) = j$. For case (iii), we find
that $r(AB) \leq r(B) < r(A) = j$. Thus, either way, we get

$$r(AB) \leq r(A) = j \qquad (5.10)$$

Now consider the identity $(AB)B^{-1} = A$. By (5.9), we can write

$$r[(AB)B^{-1}] \leq \min\{r(AB), r(B^{-1})\}$$

Applying the same reasoning that led us to (5.10), we can conclude from this that

$$r[(AB)B^{-1}] \leq r(AB)$$

Since the left-side expression of this inequality is equal to $r(A) = j$, we may write

$$j \leq r(AB) \qquad (5.11)$$

But (5.10) and (5.11) cannot be satisfied simultaneously unless $r(AB) = j$. Thus the rank
of the product matrix AB must be j, as asserted.
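The result can be illustrated numerically. In the sketch below the rank is computed by row reduction with exact fractions, which is an assumed computational shortcut rather than the determinant-based definition given in the text. The singular matrix `A` is hypothetical, and `B` is a nonsingular 3 x 3 matrix chosen for illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank via Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(A, B):
    """Ordinary matrix product of two conformable matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]     # hypothetical; row 2 = 2 * row 1
B = [[4, 1, -1], [0, 3, 2], [3, 0, 7]]    # nonsingular (determinant 99)

print(rank(A), rank(B), rank(matmul(A, B)), rank(matmul(B, A)))
```

Multiplying by the nonsingular matrix on either side leaves the rank of A unchanged, in line with the argument built on (5.10) and (5.11).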


EXERCISE 5.3 


1. Use the determinant $\begin{vmatrix} 4 & 0 & -1 \\ 2 & 1 & -7 \\ 3 & 3 & 9 \end{vmatrix}$ to verify the first four properties of determinants.

2. Show that, when all the elements of an nth-order determinant $|A|$ are multiplied by a
number k, the result will be $k^n|A|$.


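3. Which properties of determinants enable us to write the following?

(a) $\begin{vmatrix} 9 & 18 \\ 27 & 56 \end{vmatrix} = \begin{vmatrix} 9 & 18 \\ 0 & 2 \end{vmatrix}$
(b) $\begin{vmatrix} 9 & 27 \\ 4 & 2 \end{vmatrix} = 18\begin{vmatrix} 1 & 3 \\ 2 & 1 \end{vmatrix}$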


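4. Test whether the following matrices are nonsingular:

(a) $\begin{bmatrix} 4 & 0 & 1 \\ 19 & 1 & -3 \\ 7 & 1 & 0 \end{bmatrix}$
(b) $\begin{bmatrix} 4 & -2 & 1 \\ -5 & 6 & 0 \\ 7 & 0 & 3 \end{bmatrix}$
(c) $\begin{bmatrix} 7 & -1 & 0 \\ 1 & 1 & 4 \\ 13 & -3 & -4 \end{bmatrix}$
(d) $\begin{bmatrix} -4 & 9 & 5 \\ 3 & 0 & 1 \\ 10 & 8 & 6 \end{bmatrix}$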


5. What can you conclude about the rank of each matrix in Prob. 4? 

6. Can any of the given sets of 3-vectors below span the 3-space? Why or why not?

(a) [1 2 1] [2 3 1] [3 4 2]

(b) [8 1 3] [1 2 8] [-7 1 5]

7. Rewrite the simple national-income model (3.23) in the Ax = d format (with Y as
the first variable in the vector x), and then test whether the coefficient matrix A is
nonsingular.

8. Comment on the validity of the following statements:

(a) "Given any matrix A, we can always derive from it a transpose, and a determinant."
(b) "Multiplying each element of an n x n determinant by 2 will double the value of
that determinant."

(c) "If a square matrix A vanishes, then we can be sure that the equation system
Ax = d is nonsingular."

5.4 Finding the Inverse Matrix


If the matrix A in the linear-equation system Ax = d is nonsingular, then $A^{-1}$ exists, and
the solution of the system will be $x^* = A^{-1}d$. We have learned to test the nonsingularity of
A by the criterion $|A| \neq 0$. The next question is: How can we find the inverse $A^{-1}$ if A does
pass that test?

Expansion of a Determinant by Alien Cofactors

Before answering this query, let us discuss another important property of determinants.

Property VI The expansion of a determinant by alien cofactors (the cofactors of a
"wrong" row or column) always yields a value of zero.

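Example 1

If we expand the determinant $\begin{vmatrix} 4 & 1 & 2 \\ 5 & 2 & 1 \\ 1 & 0 & 3 \end{vmatrix}$ by using its first-row elements but the cofactors
of the second-row elements

$$|C_{21}| = -\begin{vmatrix} 1 & 2 \\ 0 & 3 \end{vmatrix} = -3 \qquad |C_{22}| = \begin{vmatrix} 4 & 2 \\ 1 & 3 \end{vmatrix} = 10 \qquad |C_{23}| = -\begin{vmatrix} 4 & 1 \\ 1 & 0 \end{vmatrix} = 1$$

we get $a_{11}|C_{21}| + a_{12}|C_{22}| + a_{13}|C_{23}| = 4(-3) + 1(10) + 2(1) = 0$.

More generally, applying the same type of expansion by alien cofactors as described in
Example 1 to the determinant $|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$ will yield a zero sum of products as
follows:

$$\begin{aligned}
\sum_{j=1}^{3} a_{1j}|C_{2j}| &= a_{11}|C_{21}| + a_{12}|C_{22}| + a_{13}|C_{23}| \\
&= -a_{11}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{12}\begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} - a_{13}\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} \\
&= -a_{11}a_{12}a_{33} + a_{11}a_{13}a_{32} + a_{11}a_{12}a_{33} - a_{12}a_{13}a_{31} - a_{11}a_{13}a_{32} + a_{12}a_{13}a_{31} = 0
\end{aligned} \qquad (5.12)$$

The reason for this outcome lies in the fact that the sum of products in (5.12) can be
considered as the result of the regular expansion by the second row of another determinant

$$|A^*| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$

which differs from $|A|$ only in its second row and whose first two rows are identical. As an
exercise, write out the cofactors of the second row of $|A^*|$ and verify that these are precisely
the cofactors which appeared in (5.12), and with the correct signs. Since $|A^*| = 0$, because
of its two identical rows, the expansion by alien cofactors shown in (5.12) will of necessity
yield a value of zero also.
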
Property VI is valid for determinants of all orders and applies when a determinant is
expanded by the alien cofactors of any row or any column. Thus we may state, in general,
that for a determinant of order n the following holds:

$$\sum_{j=1}^{n} a_{ij}|C_{i'j}| = 0 \quad (i \neq i') \qquad \text{[expansion by } i\text{th row and cofactors of } i'\text{th row]}$$

$$\sum_{i=1}^{n} a_{ij}|C_{ij'}| = 0 \quad (j \neq j') \qquad \text{[expansion by } j\text{th column and cofactors of } j'\text{th column]} \qquad (5.13)$$

Carefully compare (5.13) with (5.8). In the latter (regular Laplace expansion), the subscripts
of $a_{ij}$ and of $|C_{ij}|$ must be identical in each product term in the sum. In the expansion
by alien cofactors, such as in (5.13), on the other hand, one of the two subscripts (a
chosen value of $i'$ or $j'$) is inevitably "out of place."
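Property VI and the regular Laplace expansion can be checked side by side. The sketch below, with hypothetical helper names, expands the determinant of Example 1 by the elements of row i and the cofactors of row k, for every pair (i, k). By (5.8) and (5.13) the result should be the determinant itself when i = k and zero otherwise.

```python
def minor(matrix, i, j):
    """Submatrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det(matrix):
    """Determinant by Laplace expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(
        (-1) ** j * matrix[0][j] * det(minor(matrix, 0, j))
        for j in range(len(matrix))
    )


def cofactor(matrix, i, j):
    """Signed minor |C_ij| (0-based indices)."""
    return (-1) ** (i + j) * det(minor(matrix, i, j))


def expand(matrix, i, k):
    """Sum over j of a_ij * |C_kj|: regular if i == k, alien otherwise."""
    return sum(matrix[i][j] * cofactor(matrix, k, j) for j in range(len(matrix)))


A = [[4, 1, 2], [5, 2, 1], [1, 0, 3]]   # the determinant of Example 1
table = [[expand(A, i, k) for k in range(3)] for i in range(3)]
for row in table:
    print(row)
```

The table has the determinant on the diagonal and zeros off it, which is the pattern exploited in the matrix-inversion argument.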


Matrix Inversion 

Property VI, as summarized in (5.13), is of direct help in developing a method of matrix 
inversion, i.e., of finding the inverse of a matrix. 

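Assume that an n x n nonsingular matrix A is given:

$$\underset{(n \times n)}{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \qquad (|A| \neq 0) \qquad (5.14)$$

Since each element of A has a cofactor $|C_{ij}|$, it is possible to form a matrix of cofactors by
replacing each element $a_{ij}$ in (5.14) with its cofactor $|C_{ij}|$. Such a cofactor matrix, denoted
by $C = [|C_{ij}|]$, must also be n x n. For our present purposes, however, the transpose of C
is of more interest. This transpose C' is commonly referred to as the adjoint of A and is
symbolized by adj A. Written out, the adjoint takes the form

$$\underset{(n \times n)}{C'} \equiv \operatorname{adj} A \equiv \begin{bmatrix} |C_{11}| & |C_{21}| & \cdots & |C_{n1}| \\ |C_{12}| & |C_{22}| & \cdots & |C_{n2}| \\ \vdots & \vdots & & \vdots \\ |C_{1n}| & |C_{2n}| & \cdots & |C_{nn}| \end{bmatrix} \qquad (5.15)$$
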

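The matrices A and C' are conformable for multiplication, and their product AC' is
another n x n matrix in which each element is a sum of products. By utilizing the formula
for Laplace expansion as well as Property VI of determinants, the product AC' may be
expressed as follows:

$$AC' = \begin{bmatrix}
\sum_{j=1}^{n} a_{1j}|C_{1j}| & \sum_{j=1}^{n} a_{1j}|C_{2j}| & \cdots & \sum_{j=1}^{n} a_{1j}|C_{nj}| \\
\sum_{j=1}^{n} a_{2j}|C_{1j}| & \sum_{j=1}^{n} a_{2j}|C_{2j}| & \cdots & \sum_{j=1}^{n} a_{2j}|C_{nj}| \\
\vdots & \vdots & & \vdots \\
\sum_{j=1}^{n} a_{nj}|C_{1j}| & \sum_{j=1}^{n} a_{nj}|C_{2j}| & \cdots & \sum_{j=1}^{n} a_{nj}|C_{nj}|
\end{bmatrix}$$

$$= \begin{bmatrix} |A| & 0 & \cdots & 0 \\ 0 & |A| & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & |A| \end{bmatrix} \qquad \text{[by (5.8) and (5.13)]}$$

$$= |A|\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = |A| I_n \qquad \text{[factoring]}$$

As the determinant $|A|$ is a nonzero scalar, it is permissible to divide both sides of the
equation $AC' = |A|I$ by $|A|$. The result is

$$\frac{AC'}{|A|} = I \qquad \text{or} \qquad A\frac{C'}{|A|} = I$$

Premultiplying both sides of the last equation by $A^{-1}$, and using the result that $A^{-1}A = I$,
we can get $\dfrac{C'}{|A|} = A^{-1}$, or

$$A^{-1} = \frac{1}{|A|}\operatorname{adj} A \qquad \text{[by (5.15)]} \qquad (5.16)$$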

Now, we have found a way to invert the matrix A!

The general procedure for finding the inverse of a square matrix A thus involves the following
steps: (1) find $|A|$ [we need to proceed with the subsequent steps if and only if
$|A| \neq 0$, for if $|A| = 0$, the inverse in (5.16) will be undefined]; (2) find the cofactors of all
the elements of A, and arrange them as a matrix $C = [|C_{ij}|]$; (3) take the transpose of C to
get adj A; and (4) divide adj A by the determinant $|A|$. The result will be the desired inverse
$A^{-1}$.


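Example 2

Find the inverse of $A = \begin{bmatrix} 3 & 2 \\ 1 & 0 \end{bmatrix}$. Since $|A| = -2 \neq 0$, the inverse $A^{-1}$ exists. The cofactor
of each element is in this case a 1 x 1 determinant, which is simply defined as the scalar
element of that determinant itself (that is, $|a_{ij}| \equiv a_{ij}$). Thus, we have

$$C = \begin{bmatrix} |C_{11}| & |C_{12}| \\ |C_{21}| & |C_{22}| \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ -2 & 3 \end{bmatrix}$$

Observe the minus signs attached to 1 and 2, as required for cofactors. Transposing the
cofactor matrix yields

$$\operatorname{adj} A = \begin{bmatrix} 0 & -2 \\ -1 & 3 \end{bmatrix}$$

so the inverse $A^{-1}$ can be written as

$$A^{-1} = \frac{1}{|A|}\operatorname{adj} A = -\frac{1}{2}\begin{bmatrix} 0 & -2 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ \frac{1}{2} & -\frac{3}{2} \end{bmatrix}$$
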
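Example 3

Find the inverse of $B = \begin{bmatrix} 4 & 1 & -1 \\ 0 & 3 & 2 \\ 3 & 0 & 7 \end{bmatrix}$. Since $|B| = 99 \neq 0$, the inverse $B^{-1}$ also exists. The
cofactor matrix is

$$\begin{bmatrix}
\begin{vmatrix} 3 & 2 \\ 0 & 7 \end{vmatrix} & -\begin{vmatrix} 0 & 2 \\ 3 & 7 \end{vmatrix} & \begin{vmatrix} 0 & 3 \\ 3 & 0 \end{vmatrix} \\
-\begin{vmatrix} 1 & -1 \\ 0 & 7 \end{vmatrix} & \begin{vmatrix} 4 & -1 \\ 3 & 7 \end{vmatrix} & -\begin{vmatrix} 4 & 1 \\ 3 & 0 \end{vmatrix} \\
\begin{vmatrix} 1 & -1 \\ 3 & 2 \end{vmatrix} & -\begin{vmatrix} 4 & -1 \\ 0 & 2 \end{vmatrix} & \begin{vmatrix} 4 & 1 \\ 0 & 3 \end{vmatrix}
\end{bmatrix} = \begin{bmatrix} 21 & 6 & -9 \\ -7 & 31 & 3 \\ 5 & -8 & 12 \end{bmatrix}$$

Therefore,

$$\operatorname{adj} B = \begin{bmatrix} 21 & -7 & 5 \\ 6 & 31 & -8 \\ -9 & 3 & 12 \end{bmatrix}$$

and the desired inverse matrix is

$$B^{-1} = \frac{1}{|B|}\operatorname{adj} B = \frac{1}{99}\begin{bmatrix} 21 & -7 & 5 \\ 6 & 31 & -8 \\ -9 & 3 & 12 \end{bmatrix}$$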

You can check that the results in Examples 2 and 3 do satisfy $AA^{-1} = A^{-1}A = I$ and
$BB^{-1} = B^{-1}B = I$, respectively.
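The four-step inversion procedure can be sketched in Python as follows. The function names are hypothetical, and exact fractions are assumed so that the products with the original matrices come out as the identity matrix without rounding error.

```python
from fractions import Fraction


def minor(matrix, i, j):
    """Submatrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det(matrix):
    """Determinant by Laplace expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(
        (-1) ** j * matrix[0][j] * det(minor(matrix, 0, j))
        for j in range(len(matrix))
    )


def inverse(matrix):
    """Inverse as (1/|A|) adj A, following steps (1) to (4)."""
    d = det(matrix)                                   # step 1
    if d == 0:
        raise ValueError("matrix is singular; the inverse is undefined")
    n = len(matrix)
    cof = [[(-1) ** (i + j) * det(minor(matrix, i, j)) for j in range(n)]
           for i in range(n)]                         # step 2: cofactor matrix C
    adj = [list(col) for col in zip(*cof)]            # step 3: adj A = C'
    return [[Fraction(x, d) for x in row] for row in adj]   # step 4


def matmul(A, B):
    """Ordinary matrix product of two conformable matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[3, 2], [1, 0]]                       # Example 2
B = [[4, 1, -1], [0, 3, 2], [3, 0, 7]]     # Example 3

print(inverse(A))
print(matmul(B, inverse(B)))
```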


EXERCISE 5.4 


1. Suppose that we expand a fourth-order determinant by its third column and the cofactors
of the second-column elements. How would you write the resulting sum of products
in $\Sigma$ notation? What will be the sum of products in $\Sigma$ notation if we expand it by
the second row and the cofactors of the fourth-row elements?


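2. Find the inverse of each of the following matrices:

(a) $A = \begin{bmatrix} 5 & 2 \\ 0 & 1 \end{bmatrix}$
(b) $B = \begin{bmatrix} -1 & 0 \\ 9 & 2 \end{bmatrix}$
(c) $C = \begin{bmatrix} 3 & 7 \\ 3 & -1 \end{bmatrix}$
(d) $D = \begin{bmatrix} 7 & 6 \\ 0 & 3 \end{bmatrix}$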


3. (a) Drawing on your answers to Prob. 2, formulate a two-step rule for finding the adjoint
of a given 2 x 2 matrix A: In the first step, indicate what should be done to the
two diagonal elements of A in order to get the diagonal elements of adj A; in the
second step, indicate what should be done to the two off-diagonal elements of A.
(Warning: This rule applies only to 2 x 2 matrices.)

(b) Add a third step which, in conjunction with the previous two steps, yields the 2 x 2
inverse matrix $A^{-1}$.


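4. Find the inverse of each of the following matrices:

(a) $E = \begin{bmatrix} 4 & -2 & 1 \\ 7 & 3 & 0 \\ 2 & 0 & 1 \end{bmatrix}$
(b) $F = \begin{bmatrix} 1 & -1 & 2 \\ 1 & 0 & 3 \\ 4 & 0 & 2 \end{bmatrix}$
(c) $G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$
(d) $H = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
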
5. Find the inverse of

$$A = \begin{bmatrix} 4 & 1 & -5 \\ -2 & 3 & 1 \\ 3 & -1 & 4 \end{bmatrix}$$

6. Solve the system Ax = d by matrix inversion, where

(a) $4x + 3y = 28$
    $2x + 5y = 42$

(b) $4x_1 + x_2 - 5x_3 = 8$
    $-2x_1 + 3x_2 + x_3 = 12$
    $3x_1 - x_2 + 4x_3 = 5$

7. Is it possible for a matrix to be its own inverse? 


5.5 Cramer's Rule 


The method of matrix inversion discussed in Sec. 5.4 enables us to derive a practical, if not
always efficient, way of solving a linear-equation system, known as Cramer's rule.


Derivation of the Rule 

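Given an equation system Ax = d, where A is n x n, the solution can be written as

$$x^* = A^{-1}d = \frac{1}{|A|}(\operatorname{adj} A)\,d \qquad \text{[by (5.16)]}$$

provided A is nonsingular. According to (5.15), this means that

$$\begin{bmatrix} x_1^* \\ x_2^* \\ \vdots \\ x_n^* \end{bmatrix} = \frac{1}{|A|}\begin{bmatrix} |C_{11}| & |C_{21}| & \cdots & |C_{n1}| \\ |C_{12}| & |C_{22}| & \cdots & |C_{n2}| \\ \vdots & \vdots & & \vdots \\ |C_{1n}| & |C_{2n}| & \cdots & |C_{nn}| \end{bmatrix}\begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}$$

$$= \frac{1}{|A|}\begin{bmatrix} d_1|C_{11}| + d_2|C_{21}| + \cdots + d_n|C_{n1}| \\ d_1|C_{12}| + d_2|C_{22}| + \cdots + d_n|C_{n2}| \\ \vdots \\ d_1|C_{1n}| + d_2|C_{2n}| + \cdots + d_n|C_{nn}| \end{bmatrix} = \frac{1}{|A|}\begin{bmatrix} \sum_{i=1}^{n} d_i|C_{i1}| \\ \sum_{i=1}^{n} d_i|C_{i2}| \\ \vdots \\ \sum_{i=1}^{n} d_i|C_{in}| \end{bmatrix}$$
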
Equating the corresponding elements on the two sides of the equation, we obtain the solution
values

$$x_1^* = \frac{1}{|A|}\sum_{i=1}^{n} d_i|C_{i1}| \qquad x_2^* = \frac{1}{|A|}\sum_{i=1}^{n} d_i|C_{i2}| \qquad \text{(etc.)} \qquad (5.17)$$

The $\Sigma$ terms in (5.17) look unfamiliar. What do they mean? From (5.8), we see that the
Laplace expansion of a determinant $|A|$ by its first column can be expressed in the form
$\sum_{i=1}^{n} a_{i1}|C_{i1}|$. If we replace the first column of $|A|$ by the column vector d but keep all the
other columns intact, then a new determinant will result, which we can call $|A_1|$, the subscript
1 indicating that the first column has been replaced by d. The expansion of $|A_1|$ by its
first column (the d column) will yield the expression $\sum_{i=1}^{n} d_i|C_{i1}|$, because the elements $d_i$
now take the place of the elements $a_{i1}$. Returning to (5.17), we see therefore that

$$x_1^* = \frac{1}{|A|}|A_1|$$

Similarly, if we replace the second column of $|A|$ by the column vector d, while retaining
all the other columns, the expansion of the new determinant $|A_2|$ by its second column (the
d column) will result in the expression $\sum_{i=1}^{n} d_i|C_{i2}|$. When divided by $|A|$, this latter sum
will give us the solution value $x_2^*$, and so on.

This procedure can now be generalized. To find the solution value of the jth variable $x_j^*$,
we can merely replace the jth column of the determinant $|A|$ by the constant terms $d_1 \cdots d_n$
to get a new determinant $|A_j|$ and then divide $|A_j|$ by the original determinant $|A|$. Thus,
the solution of the system Ax = d can be expressed as

$$x_j^* = \frac{|A_j|}{|A|} = \frac{1}{|A|}\begin{vmatrix} a_{11} & a_{12} & \cdots & d_1 & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & d_2 & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & d_n & \cdots & a_{nn} \end{vmatrix} \qquad (5.18)$$

(jth column replaced by d)

The result in (5.18) is the statement of Cramer's rule. Note that, whereas the matrix inversion
method yields the solution values of all the endogenous variables at once ($x^*$ is a vector),
Cramer's rule can give us the solution value of only a single endogenous variable at a
time ($x_j^*$ is a scalar); this is why it may not be efficient.

Example 1

Find the solution of the equation system

$$\begin{aligned} 5x_1 + 3x_2 &= 30 \\ 6x_1 - 2x_2 &= 8 \end{aligned}$$

The coefficients and the constant terms give the following determinants:

$$|A| = \begin{vmatrix} 5 & 3 \\ 6 & -2 \end{vmatrix} = -28 \qquad |A_1| = \begin{vmatrix} 30 & 3 \\ 8 & -2 \end{vmatrix} = -84 \qquad |A_2| = \begin{vmatrix} 5 & 30 \\ 6 & 8 \end{vmatrix} = -140$$

= • 140 
Therefore, by virtue of (5.18), we can immediately write

$$x_1^* = \frac{|A_1|}{|A|} = \frac{-84}{-28} = 3 \qquad \text{and} \qquad x_2^* = \frac{|A_2|}{|A|} = \frac{-140}{-28} = 5$$

Example 2

Find the solution of the equation system

$$\begin{aligned} 7x_1 - x_2 - x_3 &= 0 \\ 10x_1 - 2x_2 + x_3 &= 8 \\ 6x_1 + 3x_2 - 2x_3 &= 7 \end{aligned}$$

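The relevant determinants $|A|$ and $|A_j|$ are found to be

$$|A| = \begin{vmatrix} 7 & -1 & -1 \\ 10 & -2 & 1 \\ 6 & 3 & -2 \end{vmatrix} = -61 \qquad |A_1| = \begin{vmatrix} 0 & -1 & -1 \\ 8 & -2 & 1 \\ 7 & 3 & -2 \end{vmatrix} = -61$$

$$|A_2| = \begin{vmatrix} 7 & 0 & -1 \\ 10 & 8 & 1 \\ 6 & 7 & -2 \end{vmatrix} = -183 \qquad |A_3| = \begin{vmatrix} 7 & -1 & 0 \\ 10 & -2 & 8 \\ 6 & 3 & 7 \end{vmatrix} = -244$$

thus the solution values of the variables are

$$x_1^* = \frac{|A_1|}{|A|} = \frac{-61}{-61} = 1 \qquad x_2^* = \frac{|A_2|}{|A|} = \frac{-183}{-61} = 3 \qquad x_3^* = \frac{|A_3|}{|A|} = \frac{-244}{-61} = 4$$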


Notice that in each of these examples we find $|A| \neq 0$. This is a necessary condition for
the application of Cramer's rule, as it is for the existence of the inverse $A^{-1}$. Cramer's rule
is, after all, based upon the concept of the inverse matrix, even though in practice it bypasses
the process of matrix inversion.
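Cramer's rule as stated in (5.18) translates almost line for line into code. The following Python sketch uses hypothetical function names and exact fractions, and reproduces the solutions of Examples 1 and 2.

```python
from fractions import Fraction


def minor(matrix, i, j):
    """Submatrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det(matrix):
    """Determinant by Laplace expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(
        (-1) ** j * matrix[0][j] * det(minor(matrix, 0, j))
        for j in range(len(matrix))
    )


def cramer(A, d):
    """Solve Ax = d by x_j = |A_j| / |A|; requires |A| != 0."""
    det_A = det(A)
    if det_A == 0:
        raise ValueError("|A| = 0: Cramer's rule is not applicable")
    solution = []
    for j in range(len(A)):
        # A_j is A with its jth column replaced by the constant vector d.
        A_j = [row[:j] + [d_i] + row[j + 1:] for row, d_i in zip(A, d)]
        solution.append(Fraction(det(A_j), det_A))
    return solution


print(cramer([[5, 3], [6, -2]], [30, 8]))                            # Example 1
print(cramer([[7, -1, -1], [10, -2, 1], [6, 3, -2]], [0, 8, 7]))     # Example 2
```

Each solution value requires its own determinant, which is the source of the inefficiency noted in the text.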

Note on Homogeneous-Equation Systems 

The equation systems Ax = d considered before can have any constants in the vector d. If
d = 0, that is, if $d_1 = d_2 = \cdots = d_n = 0$, however, the equation system will become

$$Ax = 0$$

where 0 is a zero vector. This special case is referred to as a homogeneous-equation system.
The word homogeneous here relates to the property that when all the variables $x_1, \ldots, x_n$
are multiplied by the same number, the equation system will remain valid. This is possible
only if the constant terms of the system (those unattached to any $x_i$) are all zero.

If the matrix A is nonsingular, a homogeneous-equation system can yield only a "trivial
solution," namely, $x_1^* = x_2^* = \cdots = x_n^* = 0$. This follows from the fact that the solution
$x^* = A^{-1}d$ will in this case become

$$\underset{(n \times 1)}{x^*} = \underset{(n \times n)}{A^{-1}}\;\underset{(n \times 1)}{0} = \underset{(n \times 1)}{0}$$

Alternatively, this outcome can be derived from Cramer's rule. The fact that d = 0 implies
that $|A_j|$, for all j, must contain a whole column of zeros, and thus the solution will turn
out to be

$$x_j^* = \frac{|A_j|}{|A|} = \frac{0}{|A|} = 0 \qquad (j = 1, 2, \ldots, n)$$

Curiously enough, the only way to get a nontrivial solution from a homogeneous-equation
system is to have $|A| = 0$, that is, to have a singular coefficient matrix A! In that
event, we have

$$x_j^* = \frac{|A_j|}{|A|} = \frac{0}{0}$$

where the 0/0 expression is not equal to zero but is, rather, something undefined.
Consequently, Cramer's rule is not applicable. This does not mean that we cannot obtain
solutions; it means only that we cannot get a unique solution.

Consider the homogeneous-equation system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= 0 \\ a_{21}x_1 + a_{22}x_2 &= 0 \end{aligned} \qquad (5.19)$$

It is self-evident that $x_1^* = x_2^* = 0$ is a solution, but that solution is trivial. Now, assume that
the coefficient matrix A is singular, so that $|A| = 0$. This implies that the row vector
$[a_{11} \;\; a_{12}]$ is a multiple of the row vector $[a_{21} \;\; a_{22}]$; consequently, one of the two
equations is redundant. By deleting, say, the second equation from (5.19), we end up with one
(the first) equation in two variables, the solution of which is $x_1^* = (-a_{12}/a_{11})x_2^*$. This
solution is nontrivial and well defined if $a_{11} \neq 0$, but it really represents an infinite number
of solutions because, for every possible value of $x_2^*$, there is a corresponding value $x_1^*$
such that the pair constitutes a solution. Thus no unique nontrivial solution exists for this
homogeneous-equation system. This last statement is also generally valid for the n-variable
case.
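A small numerical sketch may make the "infinite number of solutions" concrete. The coefficients below are hypothetical, chosen so that the first row is twice the second and hence $|A| = 0$; each chosen value of $x_2$ then yields a matching $x_1 = (-a_{12}/a_{11})x_2$ that satisfies both equations in (5.19).

```python
from fractions import Fraction

# Hypothetical singular coefficients: row 1 is twice row 2, so |A| = 0.
a11, a12 = 2, 4
a21, a22 = 1, 2
det_A = a11 * a22 - a12 * a21


def x1_given(x2):
    """Solve the first equation a11*x1 + a12*x2 = 0 for x1 (needs a11 != 0)."""
    return Fraction(-a12, a11) * x2


# Every choice of x2 produces a solution pair; x2 = 0 gives the trivial one.
solutions = [(x1_given(x2), x2) for x2 in (-1, 0, 1, 3)]
for x1, x2 in solutions:
    print(x1, x2, a11 * x1 + a12 * x2, a21 * x1 + a22 * x2)
```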

Solution Outcomes for a Linear-Equation System 

Our discussion of the several variants of the linear-equation system Ax = d reveals that as 
many as four different types of solution outcome are possible. For a better overall view of 
these variants, we list them in tabular form in Table 5.1. 

As a first possibility, the system may yield a unique, nontrivial solution. This type of
outcome can arise only when we have a nonhomogeneous system with a nonsingular coefficient
matrix A. The second possible outcome is a unique, trivial solution, and this is
associated with a homogeneous system with a nonsingular matrix A. As a third possibility,
we may have an infinite number of solutions. This eventuality is linked exclusively to a system
in which the equations are dependent (i.e., in which there are redundant equations).
Depending on whether the system is homogeneous, the trivial solution may or may not be
included in the set of infinite number of solutions. Finally, in the case of an inconsistent
equation system, there exists no solution at all. From the point of view of a model builder,
the most useful and desirable outcome is, of course, that of a unique, nontrivial solution
$x^* \neq 0$.

TABLE 5.1 Solution Outcomes for a Linear-Equation System Ax = d

Determinant |A|                              | Vector d ≠ 0 (nonhomogeneous system)                                        | Vector d = 0 (homogeneous system)
|A| ≠ 0 (matrix A nonsingular)               | There exists a unique, nontrivial solution x* ≠ 0.                          | There exists a unique, trivial solution x* = 0.
|A| = 0 (matrix A singular):                 |                                                                             |
  Equations dependent                        | There exist an infinite number of solutions (not including the trivial one). | There exist an infinite number of solutions (including the trivial one).
  Equations inconsistent                     | No solution exists.                                                         | [Not possible.]


EXERCISE 5.5 

1. Use Cramer's rule to solve the following equation systems:

(a) $3x_1 - 2x_2 = 6$
    $2x_1 + x_2 = 11$

(b) $-x_1 + 3x_2 = -3$
    $4x_1 - x_2 = 12$

(c) $8x_1 - 7x_2 = 9$
    $x_1 + x_2 = 3$

(d) $5x_1 + 9x_2 = 14$
    $7x_1 - 3x_2 = 4$

2. For each of the equation systems in Prob. 1, find the inverse of the coefficient matrix,
and get the solution by the formula $x^* = A^{-1}d$.

3. Use Cramer's rule to solve the following equation systems:

(a) $8x_1 - x_2 = 16$
    $2x_2 + 5x_3 = 5$
    $2x_1 + 3x_3 = 7$

(b) $-x_1 + 3x_2 + 2x_3 = 24$
    $x_1 + x_3 = 6$
    $5x_2 - x_3 = 8$

(c) $4x + 3y - 2z = 1$
    $x + 2y = 6$
    $3x + z = 4$

(d) $-x + y + z = a$
    $x - y + z = b$
    $x + y - z = c$

4. Show that Cramer's rule can be derived alternatively by the following procedure. Multiply
both sides of the first equation in the system Ax = d by the cofactor $|C_{1j}|$, and
then multiply both sides of the second equation by the cofactor $|C_{2j}|$, etc. Add all the
newly obtained equations. Then assign the values 1, 2, ..., n to the index j, successively,
to get the solution values $x_1^*, x_2^*, \ldots, x_n^*$ as shown in (5.17).


5.6 Application to Market and National-Income Models 

Simple equilibrium models such as those discussed in Chap. 3 can be solved with ease by 
Cramer’s rule or by matrix inversion. 

Market Model 

The two-commodity model described in (3.12) can be written (after eliminating the quantity
variables) as a system of two linear equations, as in (3.13'):

$$\begin{aligned} c_1P_1 + c_2P_2 &= -c_0 \\ \gamma_1P_1 + \gamma_2P_2 &= -\gamma_0 \end{aligned}$$

The three determinants needed, $|A|$, $|A_1|$, and $|A_2|$, have the following values:

$$|A| = \begin{vmatrix} c_1 & c_2 \\ \gamma_1 & \gamma_2 \end{vmatrix} = c_1\gamma_2 - c_2\gamma_1$$

$$|A_1| = \begin{vmatrix} -c_0 & c_2 \\ -\gamma_0 & \gamma_2 \end{vmatrix} = -c_0\gamma_2 + c_2\gamma_0$$

$$|A_2| = \begin{vmatrix} c_1 & -c_0 \\ \gamma_1 & -\gamma_0 \end{vmatrix} = -c_1\gamma_0 + c_0\gamma_1$$

Therefore the equilibrium prices must be

$$P_1^* = \frac{|A_1|}{|A|} = \frac{c_2\gamma_0 - c_0\gamma_2}{c_1\gamma_2 - c_2\gamma_1} \qquad P_2^* = \frac{|A_2|}{|A|} = \frac{c_0\gamma_1 - c_1\gamma_0}{c_1\gamma_2 - c_2\gamma_1}$$

which are precisely those obtained in (3.14) and (3.15). The equilibrium quantities can be
found, as before, by setting $P_1 = P_1^*$ and $P_2 = P_2^*$ in the demand or supply functions.


National-Income Model 

The simple national-income model cited in (3.23) can also be solved by the use of Cramer's
rule. As written in (3.23), the model consists of the following two simultaneous equations:

$$\begin{aligned} Y &= C + I_0 + G_0 \\ C &= a + bY \qquad (a > 0, \quad 0 < b < 1) \end{aligned}$$

These can be rearranged into the form

$$\begin{aligned} Y - C &= I_0 + G_0 \\ -bY + C &= a \end{aligned}$$

so that the endogenous variables Y and C appear only on the left of the equals signs,
whereas the exogenous variables and the unattached parameter appear only on the right.

The coefficient matrix now takes the form $\begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix}$, and the column vector of
constants (data), $\begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix}$. Note that the sum $I_0 + G_0$ is considered as a single entity,
i.e., a single element in the constant vector.

Cramer's rule now leads immediately to the following solution:

$$Y^* = \frac{\begin{vmatrix} (I_0 + G_0) & -1 \\ a & 1 \end{vmatrix}}{\begin{vmatrix} 1 & -1 \\ -b & 1 \end{vmatrix}} = \frac{I_0 + G_0 + a}{1 - b}$$

$$C^* = \frac{\begin{vmatrix} 1 & (I_0 + G_0) \\ -b & a \end{vmatrix}}{\begin{vmatrix} 1 & -1 \\ -b & 1 \end{vmatrix}} = \frac{a + b(I_0 + G_0)}{1 - b}$$

You should check that the solution values just obtained are identical with those shown in
(3.24) and (3.25).


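Let us now try to solve this model by inverting the coefficient matrix. Since the
coefficient matrix is $A = \begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix}$, its cofactor matrix is $\begin{bmatrix} 1 & b \\ 1 & 1 \end{bmatrix}$, and we therefore
have adj $A = \begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix}$. It follows that the inverse matrix is

$$A^{-1} = \frac{1}{|A|}\operatorname{adj} A = \frac{1}{1 - b}\begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix}$$

We know that, for the equation system Ax = d, the solution is expressible as $x^* = A^{-1}d$.
Applied to the present model, this means that

$$\begin{bmatrix} Y^* \\ C^* \end{bmatrix} = \frac{1}{1 - b}\begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix}\begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix} = \frac{1}{1 - b}\begin{bmatrix} I_0 + G_0 + a \\ b(I_0 + G_0) + a \end{bmatrix}$$

It is easy to see that this is again the same solution as obtained before.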
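To see the two solution methods agree on actual numbers, the sketch below plugs hypothetical parameter values into the national-income model (a = 85, b = 4/5, I0 = 55, G0 = 30 are assumptions made purely for illustration) and solves it by Cramer's rule and by the inverse matrix.

```python
from fractions import Fraction

# Hypothetical parameter values, not taken from the text.
a, b = Fraction(85), Fraction(4, 5)
I0, G0 = Fraction(55), Fraction(30)

# System:  Y - C = I0 + G0,  -bY + C = a
A = [[1, -1], [-b, 1]]
d = [I0 + G0, a]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]        # equals 1 - b

# Cramer's rule: replace one column of A by d at a time.
Y_cramer = (d[0] * A[1][1] - A[0][1] * d[1]) / det_A
C_cramer = (A[0][0] * d[1] - d[0] * A[1][0]) / det_A

# Matrix inversion: for a 2 x 2 matrix, adj A swaps the diagonal elements
# and changes the signs of the off-diagonal elements.
adj_A = [[A[1][1], -A[0][1]], [-A[1][0], A[0][0]]]
Y_inverse = (adj_A[0][0] * d[0] + adj_A[0][1] * d[1]) / det_A
C_inverse = (adj_A[1][0] * d[0] + adj_A[1][1] * d[1]) / det_A

print(Y_cramer, C_cramer)
print(Y_inverse, C_inverse)
```

Both routes should reproduce the closed forms $(I_0 + G_0 + a)/(1 - b)$ and $[a + b(I_0 + G_0)]/(1 - b)$.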


IS-LM Model: Closed Economy 

As another linear model of the economy, we can think of the economy as being made up of
two sectors: the real goods sector and the monetary sector.

The goods market involves the following equations:

\[
\begin{aligned}
Y &= C + I + G \\
C &= a + b(1 - t)Y \\
I &= d - ei \\
G &= G_0
\end{aligned}
\]

The endogenous variables are \(Y\), \(C\), \(I\), and \(i\) (where \(i\) is the rate of interest). The exogenous
variable is \(G_0\), while \(a\), \(d\), \(e\), \(b\), and \(t\) are structural parameters.

In the newly introduced money market, we have:

Equilibrium condition: \(M_d = M_s\)

Money demand: \(M_d = kY - li\)

Money supply: \(M_s = M_0\)

where \(M_0\) is the exogenous stock of money and \(k\) and \(l\) are parameters. These three equations
can be condensed into:

\[
M_0 = kY - li
\]

Together, the two sectors give us the following system of equations:

\[
\begin{aligned}
Y - C - I &= G_0 \\
b(1 - t)Y - C &= -a \\
I + ei &= d \\
kY - li &= M_0
\end{aligned}
\]

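Note that by further substitution the system could be further reduced to a \(2 \times 2\) system
of equations. For now, we will leave it as a \(4 \times 4\) system. In matrix form, we have

\[
\begin{bmatrix}
1 & -1 & -1 & 0 \\
b(1 - t) & -1 & 0 & 0 \\
0 & 0 & 1 & e \\
k & 0 & 0 & -l
\end{bmatrix}
\begin{bmatrix} Y \\ C \\ I \\ i \end{bmatrix}
=
\begin{bmatrix} G_0 \\ -a \\ d \\ M_0 \end{bmatrix}
\]

To find the determinant of the coefficient matrix, we can use Laplace expansion on one
of the columns (preferably one with the most zeros). Expanding the fourth column, we find

\[
\begin{aligned}
|A| &= (-e) \begin{vmatrix} 1 & -1 & -1 \\ b(1 - t) & -1 & 0 \\ k & 0 & 0 \end{vmatrix}
 - l \begin{vmatrix} 1 & -1 & -1 \\ b(1 - t) & -1 & 0 \\ 0 & 0 & 1 \end{vmatrix} \\
&= (-e)(k) \begin{vmatrix} -1 & -1 \\ -1 & 0 \end{vmatrix}
 - l \begin{vmatrix} 1 & -1 \\ b(1 - t) & -1 \end{vmatrix} \\
&= ek - l[(-1) - (-1)b(1 - t)] \\
&= ek + l[1 - b(1 - t)]
\end{aligned}
\]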

We can use Cramer's rule to find the equilibrium income \(Y^*\). This is done by replacing
the first column of the coefficient matrix \(A\) with the vector of exogenous variables and taking
the ratio of the determinant of the new matrix to the original determinant, or

\[
Y^* = \frac{|A_1|}{|A|}
= \frac{\begin{vmatrix}
G_0 & -1 & -1 & 0 \\
-a & -1 & 0 & 0 \\
d & 0 & 1 & e \\
M_0 & 0 & 0 & -l
\end{vmatrix}}{ek + l[1 - b(1 - t)]}
\]

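Using Laplace expansion on the second column of the numerator produces

\[
\begin{aligned}
Y^* &= \frac{(-1)(-1)^3 \begin{vmatrix} -a & 0 & 0 \\ d & 1 & e \\ M_0 & 0 & -l \end{vmatrix}
 + (-1)(-1)^4 \begin{vmatrix} G_0 & -1 & 0 \\ d & 1 & e \\ M_0 & 0 & -l \end{vmatrix}}{ek + l[1 - b(1 - t)]} \\[1ex]
&= \frac{\begin{vmatrix} -a & 0 & 0 \\ d & 1 & e \\ M_0 & 0 & -l \end{vmatrix}
 - \begin{vmatrix} G_0 & -1 & 0 \\ d & 1 & e \\ M_0 & 0 & -l \end{vmatrix}}{ek + l[1 - b(1 - t)]}
\end{aligned}
\]

By further expansion, we obtain

\[
\begin{aligned}
Y^* &= \frac{(1) \begin{vmatrix} -a & 0 \\ M_0 & -l \end{vmatrix}
 - \left[ (-1)(-1)^3 \begin{vmatrix} d & e \\ M_0 & -l \end{vmatrix}
 + (-1)^4 \begin{vmatrix} G_0 & 0 \\ M_0 & -l \end{vmatrix} \right]}{ek + l[1 - b(1 - t)]} \\[1ex]
&= \frac{al - [d(-l) - eM_0] - G_0(-l)}{ek + l[1 - b(1 - t)]} \\[1ex]
&= \frac{l(a + d + G_0) + eM_0}{ek + l[1 - b(1 - t)]}
\end{aligned}
\]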
Since the solution to \(Y^*\) is linear with respect to the exogenous variables, we can rewrite
\(Y^*\) as

\[
Y^* = \left[ \frac{e}{ek + l[1 - b(1 - t)]} \right] M_0 + \left[ \frac{l}{ek + l[1 - b(1 - t)]} \right] (a + d + G_0)
\]

In this form, we can see that the Keynesian policy multipliers with respect to the money
supply and government expenditure are the coefficients of \(M_0\) and \(G_0\), that is,

Money-supply multiplier: \(\dfrac{e}{ek + l[1 - b(1 - t)]}\)

and

Government-expenditure multiplier: \(\dfrac{l}{ek + l[1 - b(1 - t)]}\)
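
The closed-form results for the IS-LM model can be checked numerically. The Python sketch below evaluates the 4 x 4 determinant by Laplace expansion, applies Cramer's rule for Y*, and compares the outcome with the formulas derived in the text. All parameter values and function names are hypothetical, chosen only so that the arithmetic stays exact.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def equilibrium_income(a, b, t, d, e, k, l, G0, M0):
    """Return (Y*, |A|); Y* comes from Cramer's rule with column 1 replaced."""
    A = [
        [1, -1, -1, 0],
        [b * (1 - t), -1, 0, 0],
        [0, 0, 1, e],
        [k, 0, 0, -l],
    ]
    constants = [G0, -a, d, M0]
    A1 = [[constants[i]] + row[1:] for i, row in enumerate(A)]
    return det(A1) / det(A), det(A)


# Hypothetical parameter values (assumptions, not from the text).
p = dict(a=Fraction(100), b=Fraction(4, 5), t=Fraction(1, 4), d=Fraction(150),
         e=Fraction(50), k=Fraction(1, 2), l=Fraction(100))
G0, M0 = Fraction(200), Fraction(500)

Y_star, det_A = equilibrium_income(G0=G0, M0=M0, **p)

# Closed-form expressions from the text, for comparison.
denominator = p["e"] * p["k"] + p["l"] * (1 - p["b"] * (1 - p["t"]))
Y_formula = (p["l"] * (p["a"] + p["d"] + G0) + p["e"] * M0) / denominator
money_multiplier = p["e"] / denominator
spending_multiplier = p["l"] / denominator

# A unit increase in M0 or G0 should raise Y* by the matching multiplier.
dY_dM0 = equilibrium_income(G0=G0, M0=M0 + 1, **p)[0] - Y_star
dY_dG0 = equilibrium_income(G0=G0 + 1, M0=M0, **p)[0] - Y_star
print(Y_star, dY_dM0, dY_dG0)
```

Because Y* is linear in M_0 and G_0, the unit changes reproduce the two multipliers exactly rather than approximately.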


Matrix Algebra versus Elimination of Variables 

The economic models used for illustration above involve two or four equations only, and
thus only fourth or lower-order determinants need to be evaluated. For large equation systems,
higher-order determinants will appear, and their evaluation will be more complicated,
and so will be the inversion of large matrices. From the computational point of view,
in fact, matrix inversion and Cramer's rule are not necessarily more efficient than the
method of successive eliminations of variables.

However, matrix methods have other merits. As we have seen from the preceding
pages, matrix algebra gives us a compact notation for any linear-equation system, and
also furnishes a determinantal criterion for testing the existence of a unique solution. These
are advantages not otherwise available. In addition, it should be noted that, unlike the
elimination-of-variable method, which affords no means of analytically expressing
the solution, the matrix-inversion method and Cramer's rule do provide the handy solution
expressions \(x^* = A^{-1}d\) and \(x_j^* = |A_j|/|A|\). Such analytical expressions of the solution are
useful not only because they are in themselves a summary statement of the actual solution
procedure, but also because they make possible the performance of further mathematical
operations on the solution as written, if called for.

Under certain circumstances, matrix methods can even claim a computational advantage,
such as when the task is to solve at the same time several equation systems having
an identical coefficient matrix \(A\) but different constant-term vectors. In such cases, the
elimination-of-variable method would require that the computational procedure be repeated
each time a new equation system is considered. With the matrix-inversion method,
however, we are required to find the common inverse matrix \(A^{-1}\) only once; then the same
inverse can be used to premultiply all the constant-term vectors pertaining to the various
equation systems involved, in order to obtain their respective solutions. This particular
computational advantage will take on great practical significance when we consider the
solution of the Leontief input-output models in Sec. 5.7.


EXERCISE 5.6 

1. Solve the national-income model in Exercise 3.5-1:
(a) By matrix inversion (b) By Cramer's rule
(List the variables in the order Y, C, T.)

2. Solve the national-income model in Exercise 3.5-2:
(a) By matrix inversion (b) By Cramer's rule
(List the variables in the order Y, C, G.)

3. Let the IS equation be

\[
Y = \frac{A}{1 - b} - \frac{g}{1 - b}\, i
\]

where \(1 - b\) is the marginal propensity to save, \(g\) is the investment sensitivity to interest
rates, and \(A\) is an aggregate of exogenous variables. Let the LM equation be

\[
Y = \frac{M_0}{k} + \frac{l}{k}\, i
\]

where \(k\) and \(l\) are income and interest sensitivity of money demand, respectively, and
\(M_0\) is real money balances.

If \(b = 0.7\), \(g = 100\), \(A = 252\), \(k = 0.25\), \(l = 200\), and \(M_0 = 176\), then

(a) Write the IS-LM system in matrix form. 

(b) Solve for Y and i by matrix inversion. 


5.7 Leontief Input-Output Models

In its “static” version, the input-output analysis of Professor Wassily Leontief, a Nobel
Prize winner,† deals with this particular question: “What level of output should each of the
n industries in an economy produce, in order that it will just be sufficient to satisfy the total
demand for that product?”

The rationale for the term input-output analysis is quite plain to see. The output of any
industry (say, the steel industry) is needed as an input in many other industries, or even for
that industry itself; therefore the “correct” (i.e., shortage-free as well as surplus-free) level
of steel output will depend on the input requirements of all the n industries. In turn, the output
of many other industries will enter into the steel industry as inputs, and consequently
the “correct” levels of the other products will in turn depend partly upon the input requirements
of the steel industry. In view of this interindustry dependence, any set of “correct”
output levels for the n industries must be one that is consistent with all the input requirements
in the economy, so that no bottlenecks will arise anywhere. In this light, it is clear
that input-output analysis should be of great use in production planning, such as in planning
for the economic development of a country or for a program of national defense.

Strictly speaking, input-output analysis is not a form of the general equilibrium analysis
as discussed in Chap. 3. Although the interdependence of the various industries is emphasized,
the “correct” output levels envisaged are those which satisfy technical input-output
relationships rather than market equilibrium conditions. Nevertheless, the problem posed
in input-output analysis also boils down to one of solving a system of simultaneous equations,
and matrix algebra can again be of service.

Structure of an Input-Output Model 

Since an input-output model normally encompasses a large number of industries, its framework
is of necessity rather involved. To simplify the problem, the following assumptions are
as a rule adopted: (1) each industry produces only one homogeneous commodity (broadly
interpreted, this does permit the case of two or more jointly produced commodities,
provided they are produced in a fixed proportion to one another); (2) each industry uses a
fixed input ratio (or factor combination) for the production of its output; and (3) production
in every industry is subject to constant returns to scale, so that a k-fold change in every
input will result in an exactly k-fold change in the output. These assumptions are, of course,
unrealistic. A saving grace is that, if an industry produces two different commodities or uses
two different possible factor combinations, then that industry may, at least conceptually,
be broken down into two separate industries.

† Wassily W. Leontief, The Structure of American Economy 1919-1939, 2d ed., Oxford University Press,
Fair Lawn, N.J., 1951.

TABLE 5.2 Input-Coefficient Matrix

                          Output
Input      I          II         III        ...     N
I          a_{11}     a_{12}     a_{13}     ...     a_{1n}
II         a_{21}     a_{22}     a_{23}     ...     a_{2n}
III        a_{31}     a_{32}     a_{33}     ...     a_{3n}
...        ...        ...        ...                ...
N          a_{n1}     a_{n2}     a_{n3}     ...     a_{nn}

From these assumptions we see that, in order to produce each unit of the jth commodity,
the input need for the ith commodity must be a fixed amount, which we shall denote by \(a_{ij}\).
Specifically, the production of each unit of the jth commodity will require \(a_{1j}\) (amount) of
the first commodity, \(a_{2j}\) of the second commodity, ..., and \(a_{nj}\) of the nth commodity. (The
order of the subscripts in \(a_{ij}\) is easy to remember: The first subscript refers to the input, and
the second to the output, so that \(a_{ij}\) indicates how much of the ith commodity is used for the
production of each unit of the jth commodity.) For our purposes, we may assume prices to
be given and, thus, adopt “a dollar's worth” of each commodity as its unit. Then the statement
\(a_{32} = 0.35\) will mean that 35 cents' worth of the third commodity is required as an
input for producing a dollar's worth of the second commodity. The symbol \(a_{ij}\) will be referred
to as an input coefficient.

For an n-industry economy, the input coefficients can be arranged into a matrix
\(A = [a_{ij}]\), as in Table 5.2, in which each column specifies the input requirements for the
production of one unit of the output of a particular industry. The second column, for example,
states that to produce a unit (a dollar's worth) of commodity II, the inputs needed are:
\(a_{12}\) units of commodity I, \(a_{22}\) units of commodity II, etc. If no industry uses its own product
as an input, then the elements in the principal diagonal of matrix \(A\) will all be zero.

The Open Model 

If the n industries in Table 5.2 constitute the entirety of the economy, then all their products
would be for the sole purpose of meeting the input demand of the same n industries (to be
used in further production) as against the final demand (such as consumer demand, not for
further production). At the same time, all the inputs used in the economy would be in the
nature of intermediate inputs (those supplied by the n industries) as against primary inputs
(such as labor, not an industrial product). To allow for the presence of final demand and primary
inputs, we must include in the model an open sector outside of the n-industry network.
Such an open sector can accommodate the activities of the consumer households, the
government sector, and even foreign countries.

In view of the presence of the open sector, the sum of the elements in each column of
the input-coefficient matrix \(A\) (or input matrix \(A\), for short) must be less than 1. Each
column sum represents the partial input cost (not including the cost of primary inputs)
incurred in producing a dollar's worth of some commodity; if this sum is greater than or
equal to $1, therefore, production will not be economically justifiable. Symbolically, this
fact may be stated thus:

\[
\sum_{i=1}^{n} a_{ij} < 1 \qquad (j = 1, 2, \ldots, n)
\]

where the summation is over \(i\), that is, over the elements appearing in the various rows of a
specific column \(j\). Carrying this line of thought a step further, it may also be stated that,
since the value of output ($1) must be fully absorbed by the payments to all factors of
production, the amount by which the column sum falls short of $1 must represent the payment
to the primary inputs of the open sector. Thus the value of the primary inputs needed
in producing a unit of the jth commodity should be \(1 - \sum_{i=1}^{n} a_{ij}\).

If industry I is to produce an output just sufficient to meet the input requirements of the
n industries as well as the final demand of the open sector, its output level \(x_1\) must satisfy
the following equation:

\[
x_1 = a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + d_1
\]

where \(d_1\) denotes the final demand for its output and \(a_{1j}x_j\) represents the input demand
from the jth industry.¹ By the same token, the output levels of the other industries should
satisfy the equations

\[
\begin{aligned}
x_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + d_2 \\
&\;\;\vdots \\
x_n &= a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n + d_n
\end{aligned}
\]

After moving all terms that involve the variables \(x_j\) to the left of the equals signs, and leaving
only the exogenously determined final demands \(d_j\) on the right, we can express the
“correct” output levels of the n industries by the following system of n linear equations:

\[
\begin{aligned}
(1 - a_{11})x_1 - a_{12}x_2 - \cdots - a_{1n}x_n &= d_1 \\
-a_{21}x_1 + (1 - a_{22})x_2 - \cdots - a_{2n}x_n &= d_2 \\
&\;\;\vdots \\
-a_{n1}x_1 - a_{n2}x_2 - \cdots + (1 - a_{nn})x_n &= d_n
\end{aligned}
\qquad (5.20)
\]


In matrix notation, this may be written as

\[
\begin{bmatrix}
(1 - a_{11}) & -a_{12} & \cdots & -a_{1n} \\
-a_{21} & (1 - a_{22}) & \cdots & -a_{2n} \\
\vdots & \vdots & & \vdots \\
-a_{n1} & -a_{n2} & \cdots & (1 - a_{nn})
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}
\qquad (5.20')
\]

If the 1s in the principal diagonal of the matrix on the left are ignored, the matrix is
simply \(-A = [-a_{ij}]\). As it is, on the other hand, the matrix is the sum of the identity matrix
\(I_n\) (with 1s in its principal diagonal and with 0s everywhere else) and the matrix \(-A\). Thus
(5.20') can also be written as

\[
(I - A)x = d \qquad (5.20'')
\]

where \(x\) and \(d\) are, respectively, the variable vector and the final-demand (constant-term)
vector. The matrix \(I - A\) is called the Leontief matrix. As long as \(I - A\) is nonsingular, we
shall be able to find its inverse \((I - A)^{-1}\), and obtain the unique solution of the system
from the equation

\[
x^* = (I - A)^{-1} d \qquad (5.21)
\]

¹ Do not ever add up the input coefficients across a row; such a sum (say, \(a_{11} + a_{12} + \cdots + a_{1n}\))
is devoid of any useful economic meaning. The sum of the products \(a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n\),
on the other hand, does have an economic meaning; it represents the total amount of \(x_1\) needed as
input for all the n industries.


A Numerical Example 


For purposes of illustration, suppose that there are only three industries in the economy and 
one primary input, and that the input-coefficient matrix is as follows (let us use decimal 
values this time): 


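\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
= \begin{bmatrix} 0.2 & 0.3 & 0.2 \\ 0.4 & 0.1 & 0.2 \\ 0.1 & 0.3 & 0.2 \end{bmatrix}
\qquad (5.22)
\]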


Note that each column sum in \(A\) is less than 1, as it should be. Further, if we denote by \(a_{0j}\)
the dollar amount of the primary input used in producing a dollar's worth of the jth commodity,
we can write [by subtracting each column sum in (5.22) from 1]:

\[
a_{01} = 0.3 \qquad a_{02} = 0.3 \qquad \text{and} \qquad a_{03} = 0.4 \qquad (5.23)
\]
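
The column-sum check and the primary-input coefficients can be reproduced in a few lines. The Python sketch below uses the matrix of (5.22) with exact fractions; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Input-coefficient matrix A of (5.22); rows are inputs, columns are outputs.
A = [
    [Fraction("0.2"), Fraction("0.3"), Fraction("0.2")],
    [Fraction("0.4"), Fraction("0.1"), Fraction("0.2")],
    [Fraction("0.1"), Fraction("0.3"), Fraction("0.2")],
]

n = len(A)
column_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]

# In the open model every column sum must be strictly less than 1.
assert all(s < 1 for s in column_sums)

# Primary-input coefficients a_0j = 1 - (sum of column j).
a0 = [1 - s for s in column_sums]
print([float(v) for v in a0])
```

The list a0 holds the three coefficients of (5.23), one per column of A.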


With the matrix \(A\) of (5.22), the open input-output system can be expressed in the form
\((I - A)x = d\) as follows:

\[
\begin{bmatrix} 0.8 & -0.3 & -0.2 \\ -0.4 & 0.9 & -0.2 \\ -0.1 & -0.3 & 0.8 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}
\qquad (5.24)
\]

Here we have deliberately not given specific values to the final demands \(d_1\), \(d_2\), and \(d_3\). In
this way, by keeping the vector \(d\) in parametric form, our solution will appear as a “formula”
into which we can feed various specific \(d\) vectors to obtain various corresponding specific
solutions.

By inverting the \(3 \times 3\) Leontief matrix, the solution of (5.24) can be found, approximately
(because of rounding of decimal figures), to be

\[
\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix}
= (I - A)^{-1} d
= \frac{1}{0.384}
\begin{bmatrix} 0.66 & 0.30 & 0.24 \\ 0.34 & 0.62 & 0.24 \\ 0.21 & 0.27 & 0.60 \end{bmatrix}
\begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}
\]

If the specific final-demand vector (say, the final-output target of a development program)
happens to be \(d = \begin{bmatrix} 10 \\ 5 \\ 6 \end{bmatrix}\), in billions of dollars, then the following specific solution values
will emerge (again in billions of dollars):

\[
x_1^* = \frac{1}{0.384}[0.66(10) + 0.30(5) + 0.24(6)] = \frac{9.54}{0.384} = 24.84
\]

and similarly,

\[
x_2^* = \frac{1}{0.384}[0.34(10) + 0.62(5) + 0.24(6)] = \frac{7.94}{0.384} = 20.68
\qquad
x_3^* = \frac{1}{0.384}[0.21(10) + 0.27(5) + 0.60(6)] = \frac{7.05}{0.384} = 18.36
\]

An important question now arises. The production of the output mix \(x_1^*\), \(x_2^*\), and \(x_3^*\) must
entail a definite required amount of the primary input. Would the amount required be consistent
with what is available in the economy? On the basis of (5.23), the required primary
input may be calculated as follows:

\[
\sum_{j=1}^{3} a_{0j} x_j^* = 0.3(24.84) + 0.3(20.68) + 0.4(18.36) = \$21.00 \text{ billion}
\]

Therefore, the specific final demand \(d = \begin{bmatrix} 10 \\ 5 \\ 6 \end{bmatrix}\) will be feasible if and only if the available
amount of the primary input is at least $21 billion. If the amount available falls short, then
that particular production target will, of course, have to be revised downward accordingly.
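
The numerical example can be reproduced end to end with a short Python sketch: invert the Leontief matrix, compute the solution outputs for d = (10, 5, 6), and evaluate the primary-input requirement. The Gauss-Jordan routine and all names are assumptions made for illustration; the text itself does not prescribe an inversion algorithm.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination in exact arithmetic."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero pivot; a singular matrix has none.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


A = [[Fraction(v) for v in row] for row in
     (("0.2", "0.3", "0.2"), ("0.4", "0.1", "0.2"), ("0.1", "0.3", "0.2"))]
n = len(A)
leontief = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
inverse = invert(leontief)

d = [10, 5, 6]  # final demands from the text, in billions of dollars
x_star = [sum(inverse[i][j] * d[j] for j in range(n)) for i in range(n)]

a0 = [1 - sum(A[i][j] for i in range(n)) for j in range(n)]  # as in (5.23)
primary_input = sum(a0[j] * x_star[j] for j in range(n))

print([round(float(v), 2) for v in x_star], float(primary_input))
```

Working in fractions avoids the rounding mentioned in the text, so the primary-input requirement comes out as exactly 21 rather than approximately 21.00.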

One notable feature of the previous analysis is that, as long as the input coefficients
remain the same, the inverse \((I - A)^{-1}\) will not change; therefore only one matrix inversion
needs to be performed, even if we are to consider a hundred or a thousand different
final-demand vectors, such as a spectrum of alternative development targets. This economizes
the computational effort as compared with the elimination-of-variable method. However,
this advantage is not shared by Cramer's rule as outlined in (5.18), because each time
a different final-demand vector \(d\) is used, we must calculate a new determinant as the numerator
in (5.18), which is not as simple as multiplying a known inverse matrix \((I - A)^{-1}\)
by a new vector \(d\).
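
The reuse of a single inverse across many targets can be sketched as follows. The inverse is entered as reported in the text (the adjoint matrix divided by 0.384) and computed once; each final-demand vector then costs only one matrix-vector product. The two alternative targets are hypothetical, invented for illustration.

```python
from fractions import Fraction

# (I - A)^{-1} as reported in the text: (1 / 0.384) times the adjoint matrix.
DET = Fraction("0.384")
ADJOINT = [[Fraction(v) for v in row] for row in
           (("0.66", "0.30", "0.24"),
            ("0.34", "0.62", "0.24"),
            ("0.21", "0.27", "0.60"))]
INVERSE = [[v / DET for v in row] for row in ADJOINT]  # computed once


def outputs(d):
    """Solution outputs x* = (I - A)^{-1} d for one final-demand vector."""
    return [sum(c * dj for c, dj in zip(row, d)) for row in INVERSE]


# The first target is the one in the text; the others are hypothetical alternatives.
targets = {
    "text": [10, 5, 6],
    "alternative 1": [12, 5, 6],
    "alternative 2": [10, 8, 6],
}

solutions = {name: outputs(d) for name, d in targets.items()}
for name, x in solutions.items():
    print(name, [round(float(v), 2) for v in x])
```

With Cramer's rule, by contrast, every new target would require fresh numerator determinants for each output variable.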


The Existence of Nonnegative Solutions 

In the previous numerical example, the Leontief matrix \(I - A\) happens to be nonsingular,
so solution values of output variables \(x_j\) do exist. Moreover, the solution values \(x_j^*\) all turn
out to be nonnegative, as economic sense would dictate. Such desired results, however,
cannot be expected to emerge automatically; they come about only when the Leontief
matrix possesses certain properties. These properties are described in the so-called
Hawkins-Simon condition.†

† David Hawkins and Herbert A. Simon, “Note: Some Conditions of Macroeconomic Stability,”
Econometrica, July-October, 1949, pp. 245-48.

To explain this condition, we need to introduce the mathematical concept of principal
minors of a matrix, because the algebraic signs of principal minors will provide important
clues in guiding our analytical conclusions. We already know that, given a square matrix,
say, \(B\), with determinant \(|B|\), a minor is a subdeterminant obtained by deleting the ith row
and jth column of \(|B|\), where \(i\) and \(j\) are not necessarily equal. If we now impose the restriction
that \(i = j\), then the resulting minor is known as a principal minor. For example,
given a \(3 \times 3\) matrix \(B\), we can write its determinant generally as

\[
|B| = \begin{vmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{vmatrix}
\qquad (5.25)
\]

The simultaneous deletion of the ith row and the ith column (\(i = 3, 2, 1\), successively)
results in the following three \(2 \times 2\) principal minors:

\[
\begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix}
\qquad
\begin{vmatrix} b_{11} & b_{13} \\ b_{31} & b_{33} \end{vmatrix}
\qquad
\begin{vmatrix} b_{22} & b_{23} \\ b_{32} & b_{33} \end{vmatrix}
\qquad (5.26)
\]

In view of their \(2 \times 2\) dimensions, these are referred to as second-order principal minors.
We can also generate first-order principal minors (\(1 \times 1\)) by deleting any two rows and the
same-numbered columns from \(|B|\). They are

\[
|b_{11}| = b_{11} \qquad |b_{22}| = b_{22} \qquad |b_{33}| = b_{33} \qquad (5.27)
\]

Finally, to complete the picture, we can consider \(|B|\) itself as the third-order principal
minor of \(|B|\). Note that in all the minors listed in (5.25) through (5.27), their principal-diagonal
elements consist exclusively of the principal-diagonal elements of \(B\). Herein lies
the rationale for the name “principal minors.”¹

While certain economic applications require checking the algebraic signs of all the principal
minors of a matrix \(B\), quite often our conclusion depends only on the sign pattern of
a particular subset of the principal minors referred to variously as the leading principal minors,
naturally ordered principal minors, or successive principal minors. In the \(3 \times 3\) case,
this subset consists only of the first members of (5.25) through (5.27):

\[
|B_1| \equiv |b_{11}|
\qquad
|B_2| \equiv \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix}
\qquad
|B_3| \equiv \begin{vmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{vmatrix}
\qquad (5.28)
\]

Here, the single subscript \(m\) in the symbol \(|B_m|\), unlike in the subscript usage in the context
of Cramer's rule, is employed to indicate that the leading principal minor is of dimension
\(m \times m\). An easy way to derive the leading principal minors is to section off the determinant
\(|B|\) with the successive broken lines as shown:

\[
\left| \begin{array}{ccc}
\multicolumn{1}{c|}{b_{11}} & \multicolumn{1}{c|}{b_{12}} & b_{13} \\ \cline{1-1}
b_{21} & \multicolumn{1}{c|}{b_{22}} & b_{23} \\ \cline{1-2}
b_{31} & b_{32} & b_{33}
\end{array} \right|
\qquad (5.29)
\]

Taking the top element in the principal diagonal of \(|B|\) by itself alone gives us \(|B_1|\); taking
the first two elements in the principal diagonal, \(b_{11}\) and \(b_{22}\), along with their accompanying
off-diagonal elements yields \(|B_2|\); and so forth.


¹ An alternative definition of principal minors would allow for the various permutations of the
subscript indices i, j, and k. This would mean, in the input-output context, the renumbering of the
industries (e.g., the first industry becomes the second industry, and vice versa, so that the subscript
11 becomes 22, and the subscript 22 becomes 11, and so on). As a result, in addition to the \(2 \times 2\)
principal minors in (5.26), we would also have

\[
\begin{vmatrix} b_{22} & b_{21} \\ b_{12} & b_{11} \end{vmatrix}
\qquad
\begin{vmatrix} b_{33} & b_{31} \\ b_{13} & b_{11} \end{vmatrix}
\qquad
\begin{vmatrix} b_{33} & b_{32} \\ b_{23} & b_{22} \end{vmatrix}
\]

But these last three, in the order given, exactly match the three listed in (5.26) in value and algebraic
sign; thus they can be omitted from consideration for our purposes. Similarly, even though the
permutation of subscript indices can generate additional \(3 \times 3\) principal minors, they merely
duplicate the one in (5.25) in value and sign, and thus can also be disregarded.

Given a higher-dimension determinant, say, \(n \times n\), there will of course be a larger number
of principal minors, but the pattern of their construction is the same. A kth-order principal
minor is always obtained by deleting any \(n - k\) rows and the same-numbered columns
from \(|B|\). And its leading principal minors \(|B_m|\) (with \(m = 1, 2, \ldots, n\)) are always formed
by taking the first \(m\) principal-diagonal elements in \(|B|\) along with their accompanying
off-diagonal elements.

With this background, we are ready to state the following important theorem due to
Hawkins and Simon:

Given (a) an \(n \times n\) matrix \(B\), with \(b_{ij} \le 0\) \((i \ne j)\) (i.e., with all off-diagonal elements
nonpositive), and (b) an \(n \times 1\) vector \(d \ge 0\) (all elements nonnegative), there exists an \(n \times 1\)
vector \(x^* \ge 0\) such that \(Bx^* = d\), if and only if

\[
|B_m| > 0 \qquad (m = 1, 2, \ldots, n)
\]

i.e., if and only if the leading principal minors of \(B\) are all positive.

The relevance of this theorem to input-output analysis becomes clear when we let \(B\) represent
the Leontief matrix \(I - A\) (where \(b_{ij} = -a_{ij}\) for \(i \ne j\) are indeed all nonpositive), and
\(d\), the final-demand vector (where all the elements are indeed nonnegative). Then \(Bx^* = d\)
is equivalent to \((I - A)x^* = d\), and the existence of \(x^* \ge 0\) guarantees nonnegative solution
output levels. The necessary-and-sufficient condition for this, known as the Hawkins-Simon
condition, is that all the principal minors of the Leontief matrix \(I - A\) be positive.
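
To make the condition concrete, the Python sketch below computes the principal minors of the Leontief matrix from the numerical example (5.24) and tests the sign of the leading ones. The helper names, and the second two-industry matrix that fails the test, are hypothetical illustrations rather than anything given in the text.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def principal_minor(m, keep):
    """Minor formed from the rows and same-numbered columns listed in keep."""
    return det([[m[i][j] for j in keep] for i in keep])


def leading_principal_minors(m):
    """The list |B_1|, |B_2|, ..., |B_n|."""
    return [principal_minor(m, list(range(k))) for k in range(1, len(m) + 1)]


def all_principal_minors(m):
    """Every principal minor, keyed by the (0-based) indices retained."""
    n = len(m)
    return {keep: principal_minor(m, keep)
            for k in range(1, n + 1) for keep in combinations(range(n), k)}


def satisfies_hawkins_simon(m):
    """True when all leading principal minors are positive."""
    return all(v > 0 for v in leading_principal_minors(m))


# Leontief matrix I - A of the numerical example (5.24).
B = [[Fraction(v) for v in row] for row in
     (("0.8", "-0.3", "-0.2"), ("-0.4", "0.9", "-0.2"), ("-0.1", "-0.3", "0.8"))]
leading = leading_principal_minors(B)
everything = all_principal_minors(B)

# Hypothetical two-industry Leontief matrix (a11 = 0.9, a12 = a21 = 0.5, a22 = 0.1).
B_bad = [[Fraction("0.1"), Fraction("-0.5")], [Fraction("-0.5"), Fraction("0.9")]]

print([float(v) for v in leading], satisfies_hawkins_simon(B),
      satisfies_hawkins_simon(B_bad))
```

For the matrix of (5.24) the leading principal minors are 0.8, 0.6, and 0.384, so the condition holds; in the hypothetical matrix the second-order minor is negative and the condition fails.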

The proof of this theorem is too lengthy to be presented here,† but it should be worthwhile
to explore its economic meaning, which is relatively easy to see in the simple two-industry
case (\(n = 2\)).

† A thorough discussion can be found in Akira Takayama, Mathematical Economics, 2d ed., Cambridge
University Press, 1985, pp. 380-385.

Some writers use an alternative version of the Hawkins-Simon condition, which requires all the
principal minors of B (not only the leading ones) to be positive. As Takayama shows, however, in
the present case, with the special restriction on |B|, it happens that requiring the positivity of the
leading principal minors (a less stringent condition) can achieve the same result. Nevertheless, it
should be emphasized that, as a general rule, the fact that the leading principal minors satisfy a
particular sign requirement does not guarantee that all the principal minors automatically satisfy that
requirement, too. Hence, a condition stated in terms of all the principal minors must be checked
against all the principal minors, not only the leading ones.

Economic Meaning of the Hawkins-Simon Condition

For the two-industry case, the Leontief matrix is

\[
I - A = \begin{bmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{bmatrix}
\]

The first part of the Hawkins-Simon condition, \(|B_1| > 0\), requires that

\[
1 - a_{11} > 0 \qquad \text{or} \qquad a_{11} < 1
\]

Economically, this requires the amount of the first commodity used in the production of a
dollar's worth of the first commodity to be less than one dollar. The other part of the condition,
\(|B_2| > 0\), requires that

\[
(1 - a_{11})(1 - a_{22}) - a_{12}a_{21} > 0
\]

or, equivalently,

\[
a_{11} + a_{12}a_{21} + (1 - a_{11})a_{22} < 1
\]

Further, since \((1 - a_{11})a_{22}\) is nonnegative, the previous inequality implies that

\[
a_{11} + a_{12}a_{21} < 1
\]

Economically, \(a_{11}\) measures the direct use of the first commodity as input in the production
of the first commodity itself, and \(a_{12}a_{21}\) measures the indirect use: it gives the amount of
the first commodity needed in producing the specific quantity of the second commodity
that goes into the production of a dollar's worth of the first commodity. Thus the last inequality
mandates that the amount of the first commodity used as direct and indirect inputs
in producing a dollar's worth of the commodity itself must be less than one dollar. Thus,
what the Hawkins-Simon condition does is to specify certain practicability and viability
restrictions for the production process. If and only if the production process is economically
practicable and viable, can it yield meaningful, nonnegative solution output levels.

The Closed Model 

If the exogenous sector of the open input-output model is absorbed into the system as just
another industry, the model will become a closed model. In such a model, final demand and
primary input do not appear; in their place will be the input requirements and the output of
the newly conceived industry. All goods will now be intermediate in nature, because everything
that is produced is produced only for the sake of satisfying the input requirements
of the \((n + 1)\) industries in the model.

At first glance, the conversion of the open sector into an additional industry would
not seem to create any significant change in the analysis. Actually, however, since the new
industry is assumed to have a fixed input ratio as does any other industry, the supply of what
used to be the primary input must now bear a fixed proportion to what used to be called the
final demand. More concretely, this may mean, for example, that households will consume
each commodity in a fixed proportion to the labor service they supply. This certainly constitutes
a significant change in the analytical framework involved.

Mathematically, the disappearance of the final demands means that we will now have a
homogeneous-equation system. Assuming four industries only (including the new one, designated
by the subscript 0), the “correct” output levels will, by analogy to (5.20'), be those
which satisfy the equation system:

\[
\begin{bmatrix}
(1 - a_{00}) & -a_{01} & -a_{02} & -a_{03} \\
-a_{10} & (1 - a_{11}) & -a_{12} & -a_{13} \\
-a_{20} & -a_{21} & (1 - a_{22}) & -a_{23} \\
-a_{30} & -a_{31} & -a_{32} & (1 - a_{33})
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\]

Because this equation system is homogeneous, it can have a nontrivial solution if and only
if the \(4 \times 4\) Leontief matrix \(I - A\) has a vanishing determinant. The latter condition is
indeed always satisfied: In a closed model, no primary input exists; hence each column sum
in the input-coefficient matrix \(A\) must now be exactly equal to (rather than less than) 1; that
is, \(a_{0j} + a_{1j} + a_{2j} + a_{3j} = 1\), or

\[
a_{0j} = 1 - a_{1j} - a_{2j} - a_{3j}
\]

But this implies that, in every column of the matrix \(I - A\), given previously, the top element
is always equal to the negative of the sum of the other three elements. Consequently,
the four rows are linearly dependent, and we must find \(|I - A| = 0\). This guarantees that
the system does possess nontrivial solutions; in fact, as indicated in Table 5.1, it has an
infinite number of them. This means that in a closed model, with a homogeneous-equation
system, no unique “correct” output mix exists. We can determine the output levels
\(x_1^*, \ldots, x_4^*\) in proportion to one another, but cannot fix their absolute levels unless additional
restrictions are imposed on the model.
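
The argument for the closed model can be illustrated with a small Python sketch. The 4 x 4 input matrix below is hypothetical: its only relevant property is that every column sums to exactly 1. The sketch confirms that |I - A| vanishes and then builds one nontrivial solution from the cofactors of the first row, which works because the expansion of a zero determinant by cofactors (own or alien) is always zero. All names are illustrative assumptions.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def cofactor(m, i, j):
    """Signed minor obtained by deleting row i and column j."""
    minor = [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]
    return (-1) ** (i + j) * det(minor)


# Hypothetical closed-model input matrix (industries 0..3); entries are tenths.
A = [[Fraction(v, 10) for v in row] for row in
     ([2, 3, 1, 4], [3, 1, 4, 2], [3, 2, 3, 1], [2, 4, 2, 3])]
n = len(A)
assert all(sum(A[i][j] for i in range(n)) == 1 for j in range(n))

L = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
det_L = det(L)  # vanishes because the rows of I - A sum to the zero vector

# Cofactors of the first row give one nontrivial solution of (I - A)x = 0.
x = [cofactor(L, 0, j) for j in range(n)]
residual = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]

# Only relative output levels are determined; normalise on industry 0.
relative = [v / x[0] for v in x]
print(det_L, [float(v) for v in relative])
```

Any scalar multiple of relative solves the system equally well, which is the sense in which absolute output levels remain undetermined.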


EXERCISE 5.7 

1. On the basis of the model in (5.24), if the final demands are \(d_1 = 30\), \(d_2 = 15\), and
\(d_3 = 10\) (all in billions of dollars), what are the solution output levels for the three
industries? (Round off answers to two decimal places.)

2. Using the information in (5.23), calculate the total amount of primary input required
to produce the solution output levels of Prob. 1.

3. In a two-industry economy, it is known that industry I uses 10 cents of its own product
and 60 cents of commodity II to produce a dollar's worth of commodity I; industry II
uses none of its own product but uses 50 cents of commodity I in producing a dollar's
worth of commodity II; and the open sector demands $1,000 billion of commodity I
and $2,000 billion of commodity II.

(a) Write out the input matrix, the Leontief matrix, and the specific input-output
matrix equation for this economy.

(b) Check whether the data in this problem satisfy the Hawkins-Simon condition.

(c) Find the solution output levels by Cramer's rule.

4. Given the input matrix and the final-demand vector

\[
A = \begin{bmatrix} 0.05 & 0.25 & 0.34 \\ 0.33 & 0.10 & 0.12 \\ 0.19 & 0.38 & 0 \end{bmatrix}
\qquad
d = \begin{bmatrix} 1800 \\ 200 \\ 900 \end{bmatrix}
\]

(a) Explain the economic meaning of the elements 0.33, 0, and 200.

(b) Explain the economic meaning (if any) of the third-column sum.

(c) Explain the economic meaning (if any) of the third-row sum.

(d) Write out the specific input-output matrix equation for this model.

(e) Check whether the data given in this problem satisfy the Hawkins-Simon condition.

5. (a) Given a \(4 \times 4\) matrix \(B = [b_{ij}]\), write out all the principal minors.

(b) Write out all the leading principal minors.

6. Show that, by itself (without other restrictions on matrix \(B\)), the Hawkins-Simon condition
already guarantees the existence of a unique solution vector \(x^*\), though not necessarily
nonnegative.


5.8 Limitations of Static Analysis __ 

In the discussion of static equilibrium in the market or in the national income, our primary 
concern has been to find the equilibrium values of the endogenous variables in the model. A 
fundamental point that was ignored in such an analysis is the actual process of adjustments 
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and readjustments of the variables ultimately leading to the equilibrium state (if it is at all 
attainable). Wc asked only about where we shall arrive but did not question when or what 
may happen along the way 

The static type of analysis fails, therefore, to take into account two problems of importance.
One is that, since the adjustment process may take a long time to complete, an equilibrium
state as determined within a particular frame of static analysis may have lost its
relevance before it is even attained, if the exogenous forces in the model have undergone
some changes in the meantime. This is the problem of shifts of the equilibrium state. The
second is that, even if the adjustment process is allowed to run its course undisturbed, the
equilibrium state envisaged in a static analysis may be altogether unattainable. This would
be the case of a so-called unstable equilibrium, which is characterized by the fact that the
adjustment process will drive the variables further away from, rather than progressively
closer to, that equilibrium state. To disregard the adjustment process, therefore, is to assume
away the problem of attainability of equilibrium.

The shifts of the equilibrium state (in response to exogenous changes) pertain to a type
of analysis called comparative statics, and the question of attainability and stability of
equilibrium falls within the realm of dynamic analysis. Each of these clearly serves to fill a
significant gap in the static analysis, and it is thus imperative to inquire into those areas of
analysis also. We shall leave the study of dynamic analysis to Part 5 of the book and shall
next turn our attention to the problem of comparative statics.












Chapter 6

Comparative Statics and the Concept of Derivative



This chapter and Chaps. 7 and 8 will be devoted to the methods of comparative-static
analysis.

6.1 The Nature of Comparative Statics

Comparative statics, as the name suggests, is concerned with the comparison of different
equilibrium states that are associated with different sets of values of parameters and
exogenous variables. For purposes of such a comparison, we always start by assuming a given
initial equilibrium state. In the isolated-market model, for example, such an initial
equilibrium will be represented by a determinate price $P^*$ and a corresponding quantity $Q^*$.
Similarly, in the simple national-income model of (3.23), the initial equilibrium will be
specified by a determinate $Y^*$ and a corresponding $C^*$. Now if we let a disequilibrating
change occur in the model, in the form of a change in the value of some parameter or
exogenous variable, the initial equilibrium will, of course, be upset. As a result, the various
endogenous variables must undergo certain adjustments. If it is assumed that a new
equilibrium state relevant to the new values of the data can be defined and attained, the
question posed in the comparative-static analysis is: How would the new equilibrium compare
with the old?

It should be noted that in comparative statics we still disregard the process of adjustment
of the variables; we merely compare the initial (prechange) equilibrium state with the final
(postchange) equilibrium state. Also, we still preclude the possibility of instability of
equilibrium, for we assume the new equilibrium to be attainable, just as we do for the old.

A comparative-static analysis can be either qualitative or quantitative in nature. If we are
interested only in the question of, say, whether an increase in investment $I_0$ will increase or
decrease the equilibrium income $Y^*$, the analysis will be qualitative because the direction
of change is the only matter considered. But if we are concerned with the magnitude of the
change in $Y^*$ resulting from a given change in $I_0$ (that is, the size of the investment
multiplier), the analysis will obviously be quantitative. By obtaining a quantitative answer,
however, we can automatically tell the direction of change from its algebraic sign. Hence the
quantitative analysis always embraces the qualitative.

It should be clear that the problem under consideration is essentially one of finding a
rate of change: the rate of change of the equilibrium value of an endogenous variable with
respect to the change in a particular parameter or exogenous variable. For this reason, the
mathematical concept of derivative takes on preponderant significance in comparative
statics, because that concept (the most fundamental one in the branch of mathematics
known as differential calculus) is directly concerned with the notion of rate of change.
Later on, moreover, we shall find the concept of derivative to be of extreme importance for
optimization problems as well.
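
To make the qualitative/quantitative distinction concrete, the following is a minimal sketch of a comparative-static exercise. It assumes that the national-income model takes the simple linear form Y = C + I0 + G0 with C = a + bY (0 < b < 1), so that Y* = (a + I0 + G0)/(1 - b); the parameter values and the function name `equilibrium_income` are hypothetical and chosen only for illustration.

```python
# Comparative statics on an assumed linear national-income model:
#   Y = C + I0 + G0,   C = a + b*Y,   0 < b < 1
# Equilibrium income is then Y* = (a + I0 + G0) / (1 - b).

def equilibrium_income(a, b, I0, G0):
    """Return the equilibrium income Y* of the assumed model."""
    return (a + I0 + G0) / (1.0 - b)

# Hypothetical parameter values (not taken from the text).
a, b, G0 = 50.0, 0.8, 100.0
I0_old, I0_new = 150.0, 160.0

y_old = equilibrium_income(a, b, I0_old, G0)   # prechange equilibrium
y_new = equilibrium_income(a, b, I0_new, G0)   # postchange equilibrium

change_in_y = y_new - y_old
multiplier = change_in_y / (I0_new - I0_old)   # quantitative answer

# The sign of the quantitative answer gives the qualitative answer.
direction = "increase" if change_in_y > 0 else "decrease"
print("direction of change in Y*:", direction)
print("investment multiplier:", multiplier)
```

Under these assumptions the multiplier equals 1/(1 - b), which is 5 for b = 0.8 (up to floating-point rounding); the positive sign alone would answer the qualitative question.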

6.2 Rate of Change and the Derivative

Even though our present context is concerned only with the rates of change of the
equilibrium values of the variables in a model, we may carry on the discussion in a more general
manner by considering the rate of change of any variable $y$ in response to a change in
another variable $x$, where the two variables are related to each other by the function

\[ y = f(x) \]

Applied to the comparative-static context, the variable $y$ will represent the equilibrium
value of an endogenous variable, and $x$ will be some parameter. Note that, for a start, we are
restricting ourselves to the simple case where there is only a single parameter or exogenous
variable in the model. Once we have mastered this simplified case, however, the extension
to the case of more parameters will prove relatively easy.

The Difference Quotient 

Since the notion of “change” figures prominently in the present context, a special symbol
is needed to represent it. When the variable $x$ changes from the value $x_0$ to a new value $x_1$,
the change is measured by the difference $x_1 - x_0$. Hence, using the symbol $\Delta$ (the Greek
capital delta, for “difference”) to denote the change, we write $\Delta x = x_1 - x_0$. Also needed
is a way of denoting the value of the function $f(x)$ at various values of $x$. The standard
practice is to use the notation $f(x_i)$ to represent the value of $f(x)$ when $x = x_i$. Thus,
for the function $f(x) = 5 + x^2$, we have $f(0) = 5 + 0^2 = 5$; and similarly,
$f(2) = 5 + 2^2 = 9$, etc.

When $x$ changes from an initial value $x_0$ to a new value $(x_0 + \Delta x)$, the value of the
function $y = f(x)$ changes from $f(x_0)$ to $f(x_0 + \Delta x)$. The change in $y$ per unit of change
in $x$ can be represented by the difference quotient

\[ \frac{\Delta y}{\Delta x} = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \tag{6.1} \]

This quotient, which measures the average rate of change of $y$, can be calculated if we know
the initial value of $x$, or $x_0$, and the magnitude of change in $x$, or $\Delta x$. That is, $\Delta y/\Delta x$ is a
function of $x_0$ and $\Delta x$.

Example 1 Given $y = f(x) = 3x^2 - 4$, we can write

\[ f(x_0) = 3(x_0)^2 - 4 \qquad f(x_0 + \Delta x) = 3(x_0 + \Delta x)^2 - 4 \]

Therefore, the difference quotient is

\[ \frac{\Delta y}{\Delta x} = \frac{3(x_0 + \Delta x)^2 - 4 - (3x_0^2 - 4)}{\Delta x} = \frac{6x_0\,\Delta x + 3(\Delta x)^2}{\Delta x} = 6x_0 + 3\,\Delta x \tag{6.2} \]

which can be evaluated if we are given $x_0$ and $\Delta x$. Let $x_0 = 3$ and $\Delta x = 4$; then the average
rate of change of $y$ is $6(3) + 3(4) = 30$. This means that, on the average, as $x$ changes
from 3 to 7, the change in $y$ is 30 units per unit change in $x$.
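
The arithmetic of Example 1 can be checked numerically. The sketch below evaluates the difference quotient directly from its definition and compares it with the simplified form 6x0 + 3Δx; the helper name `difference_quotient` is our own and is used only for illustration.

```python
def f(x):
    """The function of Example 1: y = 3x^2 - 4."""
    return 3 * x**2 - 4

def difference_quotient(func, x0, dx):
    """Average rate of change of func between x0 and x0 + dx."""
    return (func(x0 + dx) - func(x0)) / dx

x0, dx = 3, 4
avg_rate = difference_quotient(f, x0, dx)   # computed from the definition
simplified = 6 * x0 + 3 * dx                # the simplified form derived in Example 1

print(avg_rate, simplified)                 # both equal 30
```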

The Derivative 

Frequently, we are interested in the rate of change of $y$ when $\Delta x$ is very small. In such a
case, it is possible to obtain an approximation of $\Delta y/\Delta x$ by dropping all the terms in the
difference quotient involving the expression $\Delta x$. In (6.2), for instance, if $\Delta x$ is very small,
we may simply take the term $6x_0$ on the right as an approximation of $\Delta y/\Delta x$. The smaller
the value of $\Delta x$, of course, the closer is the approximation to the true value of $\Delta y/\Delta x$.

As $\Delta x$ approaches zero (meaning that it gets closer and closer to, but never actually
reaches, zero), $(6x_0 + 3\,\Delta x)$ will approach the value $6x_0$, and by the same token, $\Delta y/\Delta x$
will approach $6x_0$ also. Symbolically, this fact is expressed either by the statement
$\Delta y/\Delta x \to 6x_0$ as $\Delta x \to 0$, or by the equation

\[ \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} (6x_0 + 3\,\Delta x) = 6x_0 \tag{6.3} \]

where the symbol $\lim_{\Delta x \to 0}$ is read as “The limit of ... as $\Delta x$ approaches 0.” If, as $\Delta x \to 0$,
the limit of the difference quotient $\Delta y/\Delta x$ indeed exists, that limit is called the derivative
of the function $y = f(x)$.
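
As a rough numerical illustration of this limiting process, we can let Δx shrink and watch the difference quotient of y = 3x² - 4 close in on 6x0. This is only a sketch: the starting point x0 = 3 and the particular sequence of Δx values are arbitrary choices, and a finite table of values suggests, but does not prove, the limit.

```python
def f(x):
    return 3 * x**2 - 4

def difference_quotient(func, x0, dx):
    return (func(x0 + dx) - func(x0)) / dx

x0 = 3                      # arbitrary initial value of x
target = 6 * x0             # the value the quotient should approach

gaps = []
for dx in [1.0, 0.1, 0.01, 0.001, 0.0001]:
    quotient = difference_quotient(f, x0, dx)
    gaps.append(abs(quotient - target))
    print(f"dx = {dx:<7} quotient = {quotient:.6f}  gap = {gaps[-1]:.6f}")
```

Each gap is approximately 3Δx, the term that is dropped in passing from (6.2) to (6.3).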

Several points should be noted about the derivative if it exists. First, a derivative is a
function; in fact, in this usage the word derivative really means a derived function. The
original function $y = f(x)$ is a primitive function, and the derivative is another function
derived from it. Whereas the difference quotient is a function of $x_0$ and $\Delta x$, you should
observe (from (6.3), for instance) that the derivative is a function of $x_0$ only. This is
because $\Delta x$ is already compelled to approach zero, and therefore it should not be regarded
as another variable in the function. Let us also add that so far we have used the subscripted
symbol $x_0$ only in order to stress the fact that a change in $x$ must start from some specific
value of $x$. Now that this is understood, we may delete the subscript and simply state that
the derivative, like the primitive function, is itself a function of the independent variable $x$.
That is, for each value of $x$, there is a unique corresponding value for the derivative
function.

Second, since the derivative is merely a limit of the difference quotient, which measures
a rate of change of $y$, the derivative must of necessity also be a measure of some rate of
change. In view of the fact that the change in $x$ envisaged in the derivative concept is
infinitesimal (that is, $\Delta x \to 0$), the rate measured by the derivative is in the nature of an
instantaneous rate of change.

Third, there is the matter of notation. Derivative functions are commonly denoted in two
ways. Given a primitive function $y = f(x)$, one way of denoting its derivative (if it exists)
is to use the symbol $f'(x)$, or simply $f'$; this notation is attributed to the mathematician
Lagrange. The other common notation is $dy/dx$, devised by the mathematician Leibniz.
[Actually there is a third notation, $Dy$, or $Df(x)$, but we shall not use it in the following
discussion.] The notation $f'(x)$, which resembles the notation for the primitive function
$f(x)$, has the advantage of conveying the idea that the derivative is itself a function of $x$.
The reason for expressing it as $f'(x)$, rather than, say, $\phi(x)$, is to emphasize that the
function $f'$ is derived from the primitive function $f$. The alternative notation, $dy/dx$, serves
instead to emphasize that the value of a derivative measures a rate of change. The letter $d$ is
the counterpart of the Greek $\Delta$, and $dy/dx$ differs from $\Delta y/\Delta x$ chiefly in that the former is
the limit of the latter as $\Delta x$ approaches zero. In the subsequent discussion, we shall use
both of these notations, depending on which seems the more convenient in a particular
context.

Using these two notations, we may define the derivative of a given function $y = f(x)$ as
follows:

\[ \frac{dy}{dx} \equiv f'(x) \equiv \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \]

Example 2 Referring to the function $y = 3x^2 - 4$ again, we have shown its difference quotient to be
(6.2), and the limit of that quotient to be (6.3). On the basis of the latter, we may now write
(replacing $x_0$ with $x$):

\[ \frac{dy}{dx} = 6x \qquad \text{or} \qquad f'(x) = 6x \]

Note that different values of $x$ will give the derivative correspondingly different values. For
instance, when $x = 3$, we find, by substituting $x = 3$ in the $f'(x)$ expression, that
$f'(3) = 6(3) = 18$; similarly, when $x = 4$, we have $f'(4) = 6(4) = 24$. Thus, whereas $f'(x)$
denotes a derivative function, the expressions $f'(3)$ and $f'(4)$ each represent a specific
derivative value.


EXERCISE 6.2

1. Given the function $y = 4x^2 + 9$:

(a) Find the difference quotient as a function of $x$ and $\Delta x$. (Use $x$ in lieu of $x_0$.)

(b) Find the derivative $dy/dx$.

(c) Find $f'(1)$ and $f'(4)$.

2. Given the function $y = 5x^2 - 4x$:

(a) Find the difference quotient as a function of $x$ and $\Delta x$.

(b) Find the derivative $dy/dx$.

(c) Find $f'(2)$ and $f'(3)$.

3. Given the function $y = 5x - 2$:

(a) Find the difference quotient $\Delta y/\Delta x$. What type of function is it?

(b) Since the expression $\Delta x$ does not appear in the function $\Delta y/\Delta x$ in part (a), does it
make any difference to the value of $\Delta y/\Delta x$ whether $\Delta x$ is large or small?
Consequently, what is the limit of the difference quotient as $\Delta x$ approaches zero?

6.3 The Derivative and the Slope of a Curve

Elementary economics tells us that, given a total-cost function $C = f(Q)$, where $C$ denotes
total cost and $Q$ the output, the marginal cost (MC) is defined as the change in total
cost resulting from a unit increase in output; that is, $MC = \Delta C/\Delta Q$. It is understood that
$\Delta Q$ is an extremely small change. For the case of a product that has discrete units (integers
only), a change of one unit is the smallest change possible; but for the case of a product
whose quantity is a continuous variable, $\Delta Q$ can refer to an infinitesimal change. In this
latter case, it is well known that the marginal cost can be measured by the slope of the
total-cost curve. But the slope of the total-cost curve is nothing but the limit of the ratio
$\Delta C/\Delta Q$, when $\Delta Q$ approaches zero. Thus the concept of the slope of a curve is merely
the geometric counterpart of the concept of the derivative. Both have to do with the
“marginal” notion so extensively used in economics.

In Fig. 6.1, we have drawn a total-cost curve $C$, which is the graph of the (primitive)
function $C = f(Q)$. Suppose that we consider $Q_0$ as the initial output level from which an
increase in output is measured; then the relevant point on the cost curve is the point A. If
output is to be raised to $Q_0 + \Delta Q = Q_2$, the total cost will be increased from $C_0$ to
$C_0 + \Delta C = C_2$; thus $\Delta C/\Delta Q = (C_2 - C_0)/(Q_2 - Q_0)$. Geometrically, this is the ratio
of two line segments, EB/AE, or the slope of the line AB. This particular ratio measures an
average rate of change (the average marginal cost for the particular $\Delta Q$ pictured) and
represents a difference quotient. As such, it is a function of the initial value $Q_0$ and the
amount of change $\Delta Q$.

What happens when we vary the magnitude of $\Delta Q$? If a smaller output increment is
contemplated (say, from $Q_0$ to $Q_1$ only), then the average marginal cost will be measured
by the slope of the line AD instead. Moreover, as we reduce the output increment further
and further, flatter and flatter lines will result until, in the limit (as $\Delta Q \to 0$), we obtain the
line KG (which is the tangent line to the cost curve at point A) as the relevant line. The slope
of KG ($= HG/KH$) measures the slope of the total-cost curve at point A and represents
the limit of $\Delta C/\Delta Q$, as $\Delta Q \to 0$, when initial output is at $Q = Q_0$. Therefore, in terms
of the derivative, the slope of the $C = f(Q)$ curve at point A corresponds to the particular
derivative value $f'(Q_0)$.

FIGURE 6.1

What if the initial output level is changed from $Q_0$ to, say, $Q_2$? In that case, point B on
the curve will replace point A as the relevant point, and the slope of the curve at the new
point B will give us the derivative value $f'(Q_2)$. Analogous results are obtainable for
alternative initial output levels. In general, the derivative $f'(Q)$, a function of $Q$, will vary as
$Q$ changes.
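
The geometric argument can be imitated numerically. In the sketch below, the total-cost function is a hypothetical convex quadratic of our own choosing (the text specifies no functional form for the curve in Fig. 6.1), and the initial output level is likewise arbitrary. The secant slopes play the role of the lines AB and AD, and they flatten toward the tangent slope as the output increment shrinks.

```python
def total_cost(q):
    """Hypothetical convex total-cost function (an assumption, not from the text)."""
    return 0.5 * q**2 + 2 * q + 10

def secant_slope(q0, dq):
    """Average marginal cost over the output increment dq, starting from q0."""
    return (total_cost(q0 + dq) - total_cost(q0)) / dq

q0 = 4.0                                  # arbitrary initial output level
increments = (4.0, 2.0, 1.0, 0.5, 0.001)  # shrinking output increments
slopes = [secant_slope(q0, dq) for dq in increments]

# For this particular cost function the derivative is f'(Q) = Q + 2.
tangent_slope = q0 + 2

for dq, slope in zip(increments, slopes):
    print(f"dQ = {dq:<6} secant slope = {slope:.4f}")
print("tangent slope at Q0:", tangent_slope)
```

Changing `q0` changes the tangent slope as well, which mirrors the remark that the derivative is itself a function of Q.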


6.4 The Concept of Limit 


The derivative $dy/dx$ has been defined as the limit of the difference quotient $\Delta y/\Delta x$ as
$\Delta x \to 0$. If we adopt the shorthand symbols $q \equiv \Delta y/\Delta x$ ($q$ for quotient) and $v \equiv \Delta x$
($v$ for variation in the value of $x$), we have

\[ \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{v \to 0} q \]

In view of the fact that the derivative concept relies heavily on the notion of limit, it is
imperative that we get a clear idea about that notion.


Left-Side Limit and Right-Side Limit 

The concept of limit is concerned with the question: “What value does one variable (say, $q$)
approach as another variable (say, $v$) approaches a specific value (say, zero)?” In order for
this question to make sense, $q$ must, of course, be a function of $v$; say, $q = g(v)$. Our
immediate interest is in finding the limit of $q$ as $v \to 0$, but we may just as easily explore
the more general case of $v \to N$, where $N$ is any finite real number. Then, $\lim_{v \to 0} q$ will be
merely a special case of $\lim_{v \to N} q$ where $N = 0$. In the course of the discussion, we shall
actually also consider the limit of $q$ as $v \to +\infty$ (plus infinity) or as $v \to -\infty$ (minus
infinity).

When we say $v \to N$, the variable $v$ can approach the number $N$ either from values
greater than $N$, or from values less than $N$. If, as $v \to N$ from the left side (from values less
than $N$), $q$ approaches a finite number $L$, we call $L$ the left-side limit of $q$. On the other hand,
if $L$ is the number that $q$ tends to as $v \to N$ from the right side (from values greater than $N$),
we call $L$ the right-side limit of $q$. The left- and right-side limits may or may not be equal.
The left-side limit of $q$ is symbolized by $\lim_{v \to N^-} q$ (the minus sign signifies from values
less than $N$), and the right-side limit is written as $\lim_{v \to N^+} q$. When, and only when, the two
limits have a common finite value (say, $L$), we consider the limit of $q$ to exist and write it as
$\lim_{v \to N} q = L$. Note that $L$ must be a finite number. If we have the situation of
$\lim_{v \to N} q = \infty$ (or $-\infty$), we shall consider $q$ to possess no limit, because
$\lim_{v \to N} q = \infty$ means that $q \to \infty$ as $v \to N$, and if $q$ will assume ever-increasing
values as $v$ tends to $N$, it would be contradictory to say that $q$ has a limit. As a convenient
way of expressing the fact that $q \to \infty$ as $v \to N$, however, some people do indeed write
$\lim_{v \to N} q = \infty$ and speak of $q$ as having an “infinite limit.”

In certain cases, only the limit of one side needs to be considered. In taking the limit of
$q$ as $v \to +\infty$, for instance, only the left-side limit of $q$ is relevant, because $v$ can approach
$+\infty$ only from the left. Similarly, for the case of $v \to -\infty$, only the right-side limit is
relevant. Whether the limit of $q$ exists in these cases will depend only on whether $q$
approaches a finite value as $v \to +\infty$, or as $v \to -\infty$.

It is important to realize that the symbol $\infty$ (infinity) is not a number, and therefore it
cannot be subjected to the usual algebraic operations. We cannot have $3 + \infty$ or $1/\infty$; nor
can we write $q = \infty$, which is not the same as $q \to \infty$. However, it is acceptable to express
the limit of $q$ as “$=$” (as against $\to$) $\infty$, for this merely indicates that $q \to \infty$.

Graphical Illustrations 

Let us illustrate, in Fig. 6.2, several possible situations regarding the limit of a function
$q = g(v)$.

Figure 6.2a shows a smooth curve. As the variable $v$ tends to the value $N$ from either
side on the horizontal axis, the variable $q$ tends to the value $L$. In this case, the left-side limit
is identical with the right-side limit; therefore we can write $\lim_{v \to N} q = L$.

FIGURE 6.2

The curve drawn in Fig. 6.2b is not smooth; it has a sharp turning point directly above
the point $N$. Nevertheless, as $v$ tends to $N$ from either side, $q$ again tends to an identical
value $L$. The limit of $q$ again exists and is equal to $L$.

Figure 6.2c shows what is known as a step function.† In this case, as $v$ tends to $N$, the
left-side limit of $q$ is $L_1$, but the right-side limit is $L_2$, a different number. Hence, $q$ does not
have a limit as $v \to N$.

Lastly, in Fig. 6.2d, as $v$ tends to $N$, the left-side limit of $q$ is $-\infty$, whereas the right-side
limit is $+\infty$, because the two parts of the (hyperbolic) curve will fall and rise indefinitely
while approaching the broken vertical line as an asymptote. Again, $\lim_{v \to N} q$ does not exist.
On the other hand, if we are considering a different sort of limit in diagram d, namely,
$\lim_{v \to +\infty} q$, then only the left-side limit has relevance, and we do find that limit to exist:
$\lim_{v \to +\infty} q = M$. Analogously, you can verify that $\lim_{v \to -\infty} q = M$ as well.

It is also possible to apply the concepts of left-side and right-side limits to the discussion
of the marginal cost in Fig. 6.1. In that context, the variables $q$ and $v$ will refer, respectively,
to the quotient $\Delta C/\Delta Q$ and to the magnitude of $\Delta Q$, with all changes being measured
from point A on the curve. In other words, $q$ will refer to the slope of such lines as AB, AD,
and KG, whereas $v$ will refer to the length of such lines as $Q_0Q_2$ (= line AE) and
$Q_0Q_1$ (= line AF). We have already seen that, as $v$ approaches zero from a positive value,
$q$ will approach a value equal to the slope of line KG. Similarly, we can establish that, if
$\Delta Q$ approaches zero from a negative value (i.e., as the decrease in output becomes less and
less), the quotient $\Delta C/\Delta Q$, as measured by the slope of such lines as RA (not drawn), will
also approach a value equal to the slope of line KG. Indeed, the situation here is very much
akin to that illustrated in Fig. 6.2a. Thus the slope of KG in Fig. 6.1 (the counterpart of $L$ in
Fig. 6.2) is indeed the limit of the quotient $q$ as $v$ tends to zero, and as such it gives us the
marginal cost at the output level $Q = Q_0$.

Evaluation of a Limit 

Let us now illustrate the algebraic evaluation of a limit of a given function $q = g(v)$.

Example 1

Given $q = 2 + v^2$, find $\lim_{v \to 0} q$. To take the left-side limit, we substitute the series of negative
values $-1, -\frac{1}{10}, -\frac{1}{100}, \ldots$ (in that order) for $v$ and find that $(2 + v^2)$ will decrease steadily
and approach 2 (because $v^2$ will gradually approach 0). Next, for the right-side limit, we
substitute the series of positive values $1, \frac{1}{10}, \frac{1}{100}, \ldots$ (in that order) for $v$ and find the same
limit as before. Inasmuch as the two limits are identical, we consider the limit of $q$ to exist
and write $\lim_{v \to 0} q = 2$.
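
The substitution procedure of Example 1 is easy to tabulate. The following sketch evaluates q = 2 + v² along a sequence of negative values and a sequence of positive values, each shrinking toward zero by powers of ten; the number of terms (six) is an arbitrary choice.

```python
def q(v):
    """The function of Example 1: q = 2 + v^2."""
    return 2 + v**2

# v = -1, -1/10, -1/100, ... (left side) and v = 1, 1/10, 1/100, ... (right side)
left_points = [-(10.0 ** -k) for k in range(6)]
right_points = [10.0 ** -k for k in range(6)]

left_values = [q(v) for v in left_points]
right_values = [q(v) for v in right_points]

for v_left, v_right, q_left, q_right in zip(left_points, right_points,
                                            left_values, right_values):
    print(f"v = {v_left:<9} q = {q_left:.10f}   |   v = {v_right:<8} q = {q_right:.10f}")
```

Both columns decrease steadily toward 2, which is consistent with the left-side and right-side limits being equal.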


† This name is easily explained by the shape of the curve. But step functions can be expressed
algebraically, too. The one illustrated in Fig. 6.2c can be expressed by the equation

\[ q = \begin{cases} L_1 & (\text{for } 0 \le v < N) \\ L_2 & (\text{for } N \le v) \end{cases} \]

Note that, in each subset of its domain as described, the function appears as a distinct constant
function, which constitutes a "step" in the graph.

In economics, step functions can be used, for instance, to show the various prices charged for
different quantities purchased (the curve shown in Fig. 6.2c pictures quantity discount) or the various
tax rates applicable to different income brackets.

It is tempting to regard the answer obtained in Example 1 as the outcome of setting
$v = 0$ in the equation $q = 2 + v^2$, but this temptation should in general be resisted. In
evaluating $\lim_{v \to N} q$, we only let $v$ tend to $N$, but, as a rule, do not let $v = N$. Indeed, we can quite
legitimately speak of the limit of $q$ as $v \to N$, even if $N$ is not in the domain of the function
$q = g(v)$. In this latter case, if we try to set $v = N$, $q$ will clearly be undefined.


Example 2

Given $q = (1 - v^2)/(1 - v)$, find $\lim_{v \to 1} q$. Here, $N = 1$ is not in the domain of the function,
and we cannot set $v = 1$ because that would involve division by zero. Moreover, even the
limit-evaluation procedure of letting $v \to 1$, as used in Example 1, will cause difficulty, for
the denominator $(1 - v)$ will approach zero when $v \to 1$, and we will still have no way of
performing the division in the limit.

One way out of this difficulty is to try to transform the given ratio to a form in which $v$
will not appear in the denominator. Since $v \to 1$ implies that $v \ne 1$, so that $(1 - v)$ is
nonzero, it is legitimate to divide the expression $(1 - v^2)$ by $(1 - v)$, and write*

\[ q = \frac{1 - v^2}{1 - v} = 1 + v \qquad (v \ne 1) \]

In this new expression for $q$, there is no longer a denominator with $v$ in it. Since $(1 + v) \to 2$
as $v \to 1$ from either side, we may then conclude that $\lim_{v \to 1} q = 2$.
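
Both points made in Example 2 can be seen in a short numerical sketch: the original ratio cannot be evaluated at v = 1, yet its values near v = 1 agree with those of the simplified form 1 + v. The sample points are arbitrary choices on either side of 1.

```python
def q(v):
    """The ratio of Example 2; undefined at v = 1."""
    return (1 - v**2) / (1 - v)

sample_points = (0.9, 0.99, 0.999, 1.001, 1.01, 1.1)
table = [(v, q(v), 1 + v) for v in sample_points]

for v, ratio, simplified in table:
    print(f"v = {v:<6} (1 - v^2)/(1 - v) = {ratio:.6f}   1 + v = {simplified:.6f}")

# At v = 1 itself the division cannot be performed.
try:
    q(1.0)
    undefined_at_one = False
except ZeroDivisionError:
    undefined_at_one = True

print("q undefined at v = 1:", undefined_at_one)
```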


Example 3

Given $q = (2v + 5)/(v + 1)$, find $\lim_{v \to +\infty} q$. The variable $v$ again appears in both the numerator
and the denominator. If we let $v \to +\infty$ in both, the result will be a ratio between two infinitely
large numbers, which does not have a clear meaning. To get out of the difficulty, we
try this time to transform the given ratio to a form in which the variable $v$ will not appear in
the numerator.* This, again, can be accomplished by dividing out the given ratio. Since
$(2v + 5)$ is not evenly divisible by $(v + 1)$, however, the result will contain a remainder term
as follows:

\[ q = \frac{2v + 5}{v + 1} = 2 + \frac{3}{v + 1} \]

But, at any rate, this new expression for $q$ no longer has a numerator with $v$ in it. Noting
that the remainder $3/(v + 1) \to 0$ as $v \to +\infty$, we can then conclude that $\lim_{v \to +\infty} q = 2$.
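
A brief numerical sketch of Example 3: as v is made larger and larger, the remainder term 3/(v + 1) shrinks toward zero and q settles near 2. The particular values of v are arbitrary, and a finite table can only suggest the behavior as v grows without bound.

```python
def q(v):
    """The ratio of Example 3."""
    return (2 * v + 5) / (v + 1)

def remainder(v):
    """The remainder term left after dividing (2v + 5) by (v + 1)."""
    return 3 / (v + 1)

large_values = (10, 1_000, 100_000, 10_000_000)
rows = [(v, q(v), remainder(v)) for v in large_values]

for v, ratio, rem in rows:
    print(f"v = {v:<10} q = {ratio:.8f}   remainder = {rem:.8f}")
```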


There also exist several useful theorems on the evaluation of limits. These will be 
discussed in Sec. 6.6. 


* The division can be performed, as in the case of numbers, in the following manner:

\[
\begin{array}{r@{\;}l}
 & 1 + v \\
1 - v \;\big) & 1 \phantom{{}+v} - v^2 \\
 & 1 - v \\
 & \phantom{1 - {}} v - v^2 \\
 & \phantom{1 - {}} v - v^2
\end{array}
\]

Alternatively, we may resort to factoring as follows:

\[ \frac{1 - v^2}{1 - v} = \frac{(1 + v)(1 - v)}{1 - v} = 1 + v \qquad (v \ne 1) \]

* Note that, unlike the $v \to 0$ case, where we want to take $v$ out of the denominator in order to
avoid division by zero, the $v \to \infty$ case is better served by taking $v$ out of the numerator. As $v \to \infty$,
an expression containing $v$ in the numerator will become infinite but an expression with $v$ in the
denominator will, more conveniently for us, approach zero and quietly vanish from the scene.

Formal View of the Limit Concept

The previous discussion should have conveyed some general ideas about the limit concept.
Let us now give it a more precise definition. Since such a definition will make use of the
concept of neighborhood of a point on a line (in particular, a specific number as a point on
the line of real numbers), we shall first explain the latter term.

For a given number $L$, there can always be found a number $(L - a_1) < L$ and another
number $(L + a_2) > L$, where $a_1$ and $a_2$ are some arbitrary positive numbers. The set of
all numbers falling between $(L - a_1)$ and $(L + a_2)$ is called the interval between those two
numbers. If the numbers $(L - a_1)$ and $(L + a_2)$ are included in the set, the set is a closed
interval; if they are excluded, the set is an open interval. A closed interval between
$(L - a_1)$ and $(L + a_2)$ is denoted by the bracketed expression

\[ [L - a_1,\, L + a_2] \equiv \{q \mid L - a_1 \le q \le L + a_2\} \]

and the corresponding open interval is denoted with parentheses:

\[ (L - a_1,\, L + a_2) \equiv \{q \mid L - a_1 < q < L + a_2\} \tag{6.4} \]

Thus, [ ] relate to the weak inequality sign $\le$, whereas ( ) relate to the strict inequality sign
$<$. But in both types of intervals, the smaller number $(L - a_1)$ is always listed first. Later
on, we shall also have occasion to refer to half-open and half-closed intervals such as $(3, 5]$
and $[6, \infty)$, which have the following meanings:

\[ (3, 5] \equiv \{x \mid 3 < x \le 5\} \qquad [6, \infty) \equiv \{x \mid 6 \le x < \infty\} \]

Now we may define a neighborhood of $L$ to be an open interval as defined in (6.4),
which is an interval “covering” the number $L$.† Depending on the magnitudes of the
arbitrary numbers $a_1$ and $a_2$, it is possible to construct various neighborhoods for the given
number $L$. Using the concept of neighborhood, the limit of a function may then be defined
as follows:

As $v$ approaches a number $N$, the limit of $q = g(v)$ is the number $L$, if, for every
neighborhood of $L$ that can be chosen, however small, there can be found a corresponding
neighborhood of $N$ (excluding the point $v = N$) in the domain of the function such that, for
every value of $v$ in that $N$-neighborhood, its image lies in the chosen $L$-neighborhood.

This statement can be clarified with the help of Fig. 6.3, which resembles Fig. 6.2a.
From what was learned about Fig. 6.2a, we know that $\lim_{v \to N} q = L$ in Fig. 6.3. Let us show
that $L$ does indeed fulfill the new definition of a limit. As the first step, select an arbitrary
small neighborhood of $L$, say, $(L - a_1, L + a_2)$. (This should have been made even
smaller, but we are keeping it relatively large to facilitate exposition.) Now construct a
neighborhood of $N$, say, $(N - b_1, N + b_2)$, such that the two neighborhoods (when
extended into quadrant I) will together define a rectangle (shaded in diagram) with two of its
corners lying on the given curve. It can then be verified that, for every value of $v$ in this
neighborhood of $N$ (not counting $v = N$), the corresponding value of $q = g(v)$ lies in the
chosen neighborhood of $L$. In fact, no matter how small an $L$-neighborhood we choose, a
(correspondingly small) $N$-neighborhood can be found with the property just cited. Thus $L$
fulfills the definition of a limit, as was to be demonstrated.

† The identification of an open interval as the neighborhood of a point is valid only when we
are considering a point on a line (one-dimensional space). In the case of a point in a plane
(two-dimensional space), its neighborhood must be thought of as an area, say, a circular area
that includes the point.

FIGURE 6.3

We can also apply the given definition to the step function of Fig. 6.2c in order to show
that neither $L_1$ nor $L_2$ qualifies as $\lim_{v \to N} q$. If we choose a very small neighborhood of $L_1$
(say, just a hair’s width on each side of $L_1$), then, no matter what neighborhood we pick for
$N$, the rectangle associated with the two neighborhoods cannot possibly enclose the lower
step of the function. Consequently, for any value of $v > N$, the corresponding value of $q$
(located on the lower step) will not be in the neighborhood of $L_1$, and thus $L_1$ fails the test
for a limit. By similar reasoning, $L_2$ must also be dismissed as a candidate for $\lim_{v \to N} q$. In
fact, in this case no limit exists for $q$ as $v \to N$.
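
The failure of L1 and L2 to meet the definition can be mimicked with a small numerical experiment. The sketch below assumes hypothetical values for the breakpoint N and the two step heights (with the lower step to the right of N, as in the discussion above), uses symmetric neighborhoods for simplicity, and checks only a finite sample of points, so it illustrates the argument rather than proving it. The helper name `stays_in_neighborhood` is our own.

```python
# Hypothetical breakpoint and step heights (assumptions for illustration only).
N, L1, L2 = 2.0, 3.0, 1.0

def g(v):
    """Step function: upper step L1 to the left of N, lower step L2 from N onward."""
    return L1 if v < N else L2

def stays_in_neighborhood(candidate, a, b, samples=1000):
    """Sample v in (N - b, N + b), v != N, and test whether g(v) always lies
    in the neighborhood (candidate - a, candidate + a)."""
    for k in range(1, samples + 1):
        offset = b * k / (samples + 1)
        for v in (N - offset, N + offset):
            if not (candidate - a < g(v) < candidate + a):
                return False
    return True

a = 0.01                               # "a hair's width" around the candidate
widths = (1.0, 0.1, 0.001)             # ever smaller neighborhoods of N
results_L1 = {b: stays_in_neighborhood(L1, a, b) for b in widths}
results_L2 = {b: stays_in_neighborhood(L2, a, b) for b in widths}

print("L1 passes for some neighborhood of N:", any(results_L1.values()))
print("L2 passes for some neighborhood of N:", any(results_L2.values()))
```

However narrow the neighborhood of N is made, it always contains points on both steps, so neither candidate passes.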

The fulfillment of the definition can also be checked algebraically rather than by graph.
For instance, consider again the function

\[ q = \frac{1 - v^2}{1 - v} = 1 + v \qquad (v \ne 1) \tag{6.5} \]

It has been found in Example 2 that $\lim_{v \to 1} q = 2$; thus, here we have $N = 1$ and $L = 2$. To
verify that $L = 2$ is indeed the limit of $q$, we must demonstrate that, for every chosen
neighborhood of $L$, $(2 - a_1, 2 + a_2)$, there exists a neighborhood of $N$, $(1 - b_1, 1 + b_2)$,
such that, whenever $v$ is in this neighborhood of $N$, $q$ must be in the chosen neighborhood
of $L$. This means essentially that, for given values of $a_1$ and $a_2$, however small, two numbers
$b_1$ and $b_2$ must be found such that, whenever the inequality

\[ 1 - b_1 < v < 1 + b_2 \qquad (v \ne 1) \tag{6.6} \]

is satisfied, another inequality of the form

\[ 2 - a_1 < q < 2 + a_2 \tag{6.7} \]

must also be satisfied. To find such a pair of numbers $b_1$ and $b_2$, let us first rewrite (6.7) by
substituting (6.5):

\[ 2 - a_1 < 1 + v < 2 + a_2 \tag{6.7'} \]

This, in turn, can be transformed (by subtracting 1 from each side) into the inequality

\[ 1 - a_1 < v < 1 + a_2 \tag{6.7''} \]

A comparison of (6.7''), a variant of (6.7), with (6.6) suggests that if we choose the two
numbers $b_1$ and $b_2$ to be $b_1 = a_1$ and $b_2 = a_2$, the two inequalities (6.6) and (6.7) will
always be satisfied simultaneously. Thus the neighborhood of $N$, $(1 - b_1, 1 + b_2)$, as
required in the definition of a limit, can indeed be found for the case of $L = 2$, and this
establishes $L = 2$ as the limit.
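
The choice b1 = a1 and b2 = a2 can be spot-checked numerically. The sketch below samples points of the neighborhood (1 - b1, 1 + b2), skips v = 1, and tests whether q falls inside (2 - a1, 2 + a2). The function name `choice_works`, the sample size, and the trial values of a1 and a2 are our own assumptions; a finite sample is of course no substitute for the algebraic argument.

```python
def q(v):
    """The function in (6.5); undefined at v = 1."""
    return (1 - v**2) / (1 - v)

def choice_works(a1, a2, samples=1000):
    """Test the choice b1 = a1, b2 = a2 on sampled points of (1 - b1, 1 + b2)."""
    b1, b2 = a1, a2
    for k in range(1, samples):
        # Evenly spaced points strictly inside the neighborhood of N = 1.
        v = (1 - b1) + (b1 + b2) * k / samples
        if v == 1:
            continue                      # the point v = N is excluded
        if not (2 - a1 < q(v) < 2 + a2):
            return False
    return True

trials = [(0.5, 0.5), (0.1, 0.1), (0.2, 0.05), (0.01, 0.01)]
outcomes = [choice_works(a1, a2) for a1, a2 in trials]

for (a1, a2), ok in zip(trials, outcomes):
    print(f"a1 = {a1}, a2 = {a2}: every sampled q in the L-neighborhood? {ok}")
```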

Let us now utilize the definition of a limit in the opposite way, to show that another value
(say, 3) cannot qualify as $\lim_{v \to 1} q$ for the function in (6.5). If 3 were that limit, it would have
to be true that, for every chosen neighborhood of 3, $(3 - a_1, 3 + a_2)$, there exists a
neighborhood of 1, $(1 - b_1, 1 + b_2)$, such that, whenever $v$ is in the latter neighborhood, $q$ must
be in the former neighborhood. That is, whenever the inequality

\[ 1 - b_1 < v < 1 + b_2 \]

is satisfied, another inequality of the form

\[ 3 - a_1 < 1 + v < 3 + a_2 \]

or

\[ 2 - a_1 < v < 2 + a_2 \]

must also be satisfied. The only way to achieve this result is to choose $b_1 = a_1 - 1$ and
$b_2 = a_2 + 1$. This would imply that the neighborhood of 1 is to be the open interval
$(2 - a_1, 2 + a_2)$. According to the definition of a limit, however, $a_1$ and $a_2$ can be made
arbitrarily small, say, $a_1 = a_2 = 0.1$. In that case, the last-mentioned interval will turn out
to be $(1.9, 2.1)$, which lies entirely to the right of the point $v = 1$ on the horizontal axis and,
hence, does not even qualify as a neighborhood of 1. Thus the definition of a limit cannot
be satisfied by the number 3. A similar procedure can be employed to show that any number
other than 2 will contradict the definition of a limit in the present case.

In general if one number satisfies the definition of a limit of q as v -* A', then no other 
number can. If a limit exists, it is unique. 
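
The neighborhood argument can be spot-checked numerically. The following is only an illustrative sketch: the helper names `q` and `stays_inside`, the sampling scheme, and the trial values of $a_1$, $a_2$, $b_1$, $b_2$ are assumptions made for the illustration, and a finite sample can merely suggest what the argument above establishes.

```python
def q(v):
    """The function in (6.5), q = 1 + v, defined for v != 1."""
    return 1 + v

def stays_inside(L, a1, a2, b1, b2, samples=1000):
    """Sample v in (1 - b1, 1 + b2), skipping v = 1, and report whether
    every sampled q(v) lies in the neighborhood (L - a1, L + a2)."""
    for i in range(1, samples):
        v = (1 - b1) + (b1 + b2) * i / samples
        if v == 1:
            continue  # the point v = 1 itself is excluded
        if not (L - a1 < q(v) < L + a2):
            return False
    return True

a1 = a2 = 0.1
# Candidate limit L = 2, with the choice b1 = a1 and b2 = a2.
print(stays_inside(2, a1, a2, b1=a1, b2=a2))
# Candidate limit L = 3: even a very small neighborhood of 1 fails.
print(stays_inside(3, a1, a2, b1=0.001, b2=0.001))
```

The first call reports that every sampled $q$ stays within $(1.9, 2.1)$, while the second fails at once, because near $v = 1$ the value of $q$ is close to 2 and hence far from 3.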


EXERCISE 6.4

1. Given the function $q = (v^2 + v - 56)/(v - 7)$, $(v \neq 7)$, find the left-side limit and the
right-side limit of $q$ as $v$ approaches 7. Can we conclude from these answers that $q$ has
a limit as $v$ approaches 7?

2. Given $q = [(v + 2)^3 - 8]/v$, $(v \neq 0)$, find:

(a) $\lim_{v \to 0} q$ (b) $\lim_{v \to 2} q$ (c) $\lim_{v \to a} q$

3. Given $q = 5 - 1/v$, $(v \neq 0)$, find:

(a) $\lim_{v \to +\infty} q$ (b) $\lim_{v \to -\infty} q$

4. Use Fig. 6.3 to show that we cannot consider the number $(L + a_2)$ as the limit of $q$ as $v$
tends to $N$.


6.5 Digression on Inequalities and Absolute Values

We have encountered inequality signs many times before. In the discussion of Sec. 6.4, we 
also applied mathematical operations to inequalities. In transforming (6.7') into (6.7"), for 
example, we subtracted 1 from each side of the inequality. What rules of operations are 
generally applicable to inequalities (as opposed to equations)? 

Rules of Inequalities 

To begin with, let us state an important property of inequalities: inequalities are transitive.
This means that, if $a > b$ and if $b > c$, then $a > c$. Since equalities (equations) are also
transitive, the transitivity property should apply to “weak” inequalities ($\geq$ or $\leq$) as well as
to “strict” ones ($>$ or $<$). Thus we have

$$a > b,\ b > c \;\Rightarrow\; a > c$$
$$a \geq b,\ b \geq c \;\Rightarrow\; a \geq c$$

This property is what makes possible the writing of a continued inequality, such as
$3 < a < b < 8$ or $7 \leq x \leq 24$. (In writing a continued inequality, the inequality signs are
as a rule arranged in the same direction, usually with the smallest number on the left.)

The most important rules of inequalities are those governing the addition (subtraction) 
of a number to (from) an inequality, the multiplication or division of an inequality by a 
number, and the squaring of an inequality. Specifically, these rules are as follows. 

Rule I (addition and subtraction) $a > b \;\Rightarrow\; a \pm k > b \pm k$

An inequality will continue to hold if an equal quantity is added to or subtracted from each
side. This rule may be generalized thus: If $a > b > c$, then $a \pm k > b \pm k > c \pm k$.

Rule II (multiplication and division)

$$a > b \;\Rightarrow\; \begin{cases} ka > kb & (k > 0) \\ ka < kb & (k < 0) \end{cases}$$

The multiplication of both sides by a positive number preserves the inequality, but a
negative multiplier will cause the sense (or direction) of the inequality to be reversed.

Example 1 Since $6 > 5$, multiplication by 3 will yield $3(6) > 3(5)$, or $18 > 15$; but multiplication by $-3$
will result in $(-3)6 < (-3)5$, or $-18 < -15$.

Division of an inequality by a number $n$ is equivalent to multiplication by the number
$1/n$; therefore the rule on division is subsumed under the rule on multiplication.

Rule III (squaring) $a > b,\ (b \geq 0) \;\Rightarrow\; a^2 > b^2$

If its two sides are both nonnegative, the inequality will continue to hold when both sides
are squared.

Example 2 Since $4 > 3$ and since both sides are positive, we have $4^2 > 3^2$, or $16 > 9$. Similarly, since
$2 > 0$, it follows that $2^2 > 0^2$, or $4 > 0$.

Rules I through III have been stated in terms of strict inequalities, but their validity is
unaffected if the $>$ signs are replaced by $\geq$ signs.
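
As a quick check on Rules I through III, the sketch below tests them on specific numbers. The function name `check_rules` and the trial values are hypothetical choices made for this illustration; testing a few cases does not, of course, prove the rules.

```python
def check_rules(a, b, k):
    """Given a > b, report whether Rules I, II, and III hold for these numbers.
    An entry is None when the rule's side condition is not met."""
    if not a > b:
        raise ValueError("expected a > b")
    rule_1 = (a + k > b + k) and (a - k > b - k)
    if k > 0:
        rule_2 = k * a > k * b
    elif k < 0:
        rule_2 = k * a < k * b      # sense of the inequality reversed
    else:
        rule_2 = None               # k = 0 is not covered by Rule II
    rule_3 = (a**2 > b**2) if b >= 0 else None
    return rule_1, rule_2, rule_3

print(check_rules(6, 5, 3))     # Example 1 with a positive multiplier
print(check_rules(6, 5, -3))    # Example 1 with a negative multiplier
print(check_rules(-1, -4, 2))   # b < 0, so Rule III does not apply
print((-1)**2 > (-4)**2)        # squaring anyway: 1 > 16 is false
```

The last line shows why Rule III carries the side condition $b \geq 0$: although $-1 > -4$, squaring both sides reverses the ranking.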




Absolute Values and Inequalities 

When the domain of a variable is an open interval $(a, b)$, the domain may be denoted by
the set $\{x \mid a < x < b\}$ or, more simply, by the inequality $a < x < b$. Similarly, if it is a
closed interval $[a, b]$, it may be expressed by the weak inequality $a \leq x \leq b$. In the special
case of an interval of the form $(-a, a)$, say, $(-10, 10)$, it may be represented either by
the inequality $-10 < x < 10$ or, alternatively, by the inequality

$$|x| < 10$$

where the symbol $|x|$ denotes the absolute value (or numerical value) of $x$.

For any real number $n$, the absolute value of $n$ is defined as follows:*

$$|n| = \begin{cases} n & (\text{if } n > 0) \\ -n & (\text{if } n < 0) \\ 0 & (\text{if } n = 0) \end{cases} \tag{6.8}$$

Note that, if $n = 15$, then $|15| = 15$; but if $n = -15$, we find

$$|-15| = -(-15) = 15$$

also. In effect, therefore, the absolute value of any real number is simply its numerical value
after the sign is removed. For this reason, we always have $|n| = |-n|$. The absolute value
of $n$ is also called the modulus of $n$.

Given the expression $|x| = 10$, we may conclude from (6.8) that $x$ must be either
10 or $-10$. By the same token, the expression $|x| < 10$ means that (1) if $x > 0$, then
$x = |x| < 10$, so that $x$ must be less than 10; but also (2) if $x < 0$, then according to (6.8)
we have $-x = |x| < 10$, or $x > -10$, so that $x$ must be greater than $-10$. Hence, by
combining the two parts of this result, we see that $x$ must lie within the open interval $(-10, 10)$.
In general, we can write

$$|x| < n \;\Leftrightarrow\; -n < x < n \quad (n > 0) \tag{6.9}$$

which can also be extended to weak inequalities as follows:

$$|x| \leq n \;\Leftrightarrow\; -n \leq x \leq n \quad (n \geq 0) \tag{6.10}$$

Because they are themselves numbers, the absolute values of two numbers $m$ and $n$
can be added, subtracted, multiplied, and divided. The following properties characterize
absolute values:

$$|m| + |n| \geq |m + n|$$
$$|m| \cdot |n| = |m \cdot n|$$
$$\frac{|m|}{|n|} = \left| \frac{m}{n} \right|$$

The first of these, interestingly, involves an inequality rather than an equation. The reason
for this is easily seen: whereas the left-hand expression $|m| + |n|$ is definitely a sum of two
numerical values (both taken as positive), the expression $|m + n|$ is the numerical value of
either a sum (if $m$ and $n$ are, say, both positive) or a difference (if $m$ and $n$ have opposite
signs). Thus the left side may exceed the right side.

* We caution again that, although the absolute-value notation is similar to that of a first-order
determinant, these two concepts are entirely different. The definition of a first-order determinant is
$|a_{ij}| = a_{ij}$, regardless of the sign of $a_{ij}$. In the definition of the absolute value $|n|$, on the other hand,
the sign of $n$ will make a difference. The context of the discussion should normally make it clear
whether an absolute value or a first-order determinant is under consideration.

Example 3 If $m = 5$ and $n = 3$, then $|m| + |n| = |m + n| = 8$. But if $m = 5$ and $n = -3$, then
$|m| + |n| = 5 + 3 = 8$, whereas

$$|m + n| = |5 - 3| = 2$$

is a smaller number.

In the other two properties, on the other hand, it makes no difference whether m and n 
have identical or opposite signs, since, in taking the absolute value of the product or 
quotient on the right-hand side, the sign of the latter term will be removed in any case. 

Example 4 If $m = 7$ and $n = 8$, then $|m| \cdot |n| = |m \cdot n| = 7(8) = 56$. But even if $m = -7$ and $n = 8$
(opposite signs), we still get the same result from

$$|m| \cdot |n| = |-7| \cdot |8| = 7(8) = 56$$

and

$$|m \cdot n| = |-7(8)| = 7(8) = 56$$
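
The three properties of absolute values can be checked over a small grid of integers. This is a minimal sketch under stated assumptions: the helper name `abs_properties` and the particular list of trial values are illustrative choices, not part of the text.

```python
import itertools

def abs_properties(m, n):
    """Check the three properties for one pair (m, n); the quotient
    property is skipped (None) when n = 0."""
    triangle = abs(m) + abs(n) >= abs(m + n)
    product = abs(m) * abs(n) == abs(m * n)
    quotient = (abs(m) / abs(n) == abs(m / n)) if n != 0 else None
    return triangle, product, quotient

values = [-8, -7, -3, 0, 3, 5, 7, 8]
all_ok = all(False not in abs_properties(m, n)
             for m, n in itertools.product(values, repeat=2))
print(all_ok)

# Pairs for which the first property holds as a strict inequality.
strict = [(m, n) for m, n in itertools.product(values, repeat=2)
          if abs(m) + abs(n) > abs(m + n)]
print(all(m * n < 0 for m, n in strict))
```

On this grid, the first property is a strict inequality exactly when $m$ and $n$ have opposite signs, which is the situation described in Example 3.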

Solution of an Inequality 

Like an equation, an inequality containing a variable (say, $x$) may have a solution; the
solution, if it exists, is a set of values of $x$ which make the inequality a true statement. Such a
solution will itself usually be in the form of an inequality.

Example 5 Find the solution of the inequality

$$3x - 3 > x + 1$$

As in solving an equation, the variable terms should first be collected on one side of the
inequality. By adding $(3 - x)$ to both sides, we obtain

$$3x - 3 + 3 - x > x + 1 + 3 - x$$

or

$$2x > 4$$

Multiplying both sides by $\frac{1}{2}$ (which does not reverse the sense of the inequality, because
$\frac{1}{2} > 0$) will then yield the solution

$$x > 2$$

which is itself an inequality. This solution is not a single number, but a set of numbers.
Therefore we may also express the solution as the set $\{x \mid x > 2\}$ or as the open interval
$(2, \infty)$.

Example 6 Solve the inequality $|1 - x| \leq 3$. First, let us get rid of the absolute-value notation by
utilizing (6.10). The given inequality is equivalent to the statement that

$$-3 \leq 1 - x \leq 3$$

or, after subtracting 1 from each side,

$$-4 \leq -x \leq 2$$

Multiplying each side by $(-1)$, we then get

$$4 \geq x \geq -2$$

where the sense of inequality has been duly reversed. Writing the smaller number first, we
may express the solution in the form of the inequality

$$-2 \leq x \leq 4$$

or in the form of the set $\{x \mid -2 \leq x \leq 4\}$ or the closed interval $[-2, 4]$.
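
The steps of Example 6 can be written as a short routine. This is a sketch under stated assumptions: the function name `solve_abs_le` and its parameterization of the inequality as $|c_0 + c_1 x| \leq r$ are hypothetical conveniences introduced here; the steps themselves follow (6.10), Rule I, and Rule II.

```python
from fractions import Fraction

def solve_abs_le(c0, c1, r):
    """Solve |c0 + c1*x| <= r for x, with c1 != 0 and r >= 0.
    Returns (low, high), meaning low <= x <= high."""
    if c1 == 0 or r < 0:
        raise ValueError("need c1 != 0 and r >= 0")
    c0, c1, r = Fraction(c0), Fraction(c1), Fraction(r)
    # By (6.10): -r <= c0 + c1*x <= r.  Rule I: subtract c0 from each part.
    left, right = -r - c0, r - c0
    # Rule II: divide each part by c1; a negative divisor reverses the sense.
    left, right = left / c1, right / c1
    if c1 < 0:
        left, right = right, left   # write the smaller number first
    return left, right

low, high = solve_abs_le(1, -1, 3)   # Example 6: |1 - x| <= 3
print(low, high)
```

For Example 6 the routine returns the endpoints of the closed interval $[-2, 4]$. Exact fractions are used so that divisions such as $1/3$ introduce no rounding.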

Sometimes, a problem may call for the satisfaction of several inequalities in several
variables simultaneously; then we must solve a system of simultaneous inequalities. This
problem arises, for example, in nonlinear programming, which will be discussed in Chap. 13.


EXERCISE 6.5

1. Solve the following inequalities:

(a) $3x - 1 < 7x + 2$ (c) $5x + 1 < x + 3$

(b) $2x + 5 < x - 4$ (d) $2x - 1 < 6x + 5$

2. If $8x - 3 < 0$ and $8x > 0$, express these in a continued inequality and find its solution.

3. Solve the following:

(a) $|x + 1| < 6$ (b) $|4 - 3x| < 2$ (c) $|2x + 3| \leq 5$


6.6 Limit Theorems 


Our interest in rates of change led us to the consideration of the concept of derivative,
which, being in the nature of the limit of a difference quotient, in turn prompted us to study
questions of the existence and evaluation of a limit. The basic process of limit evaluation,
as illustrated in Sec. 6.4, involves letting the variable $v$ approach a particular number
(say, $N$) and observing the value that $q$ approaches. When actually evaluating the limit of a
function, however, we may draw upon certain established limit theorems, which can
materially simplify the task, especially for complicated functions.

Theorems Involving a Single Function 

When a single function $q = g(v)$ is involved, the following theorems are applicable.

Theorem I If $q = av + b$, then $\lim_{v \to N} q = aN + b$ ($a$ and $b$ are constants).

Example 1 Given $q = 5v + 7$, we have $\lim_{v \to 2} q = 5(2) + 7 = 17$. Similarly, $\lim_{v \to 0} q = 5(0) + 7 = 7$.

Theorem II If $q = g(v) = b$, then $\lim_{v \to N} q = b$.

This theorem, which says that the limit of a constant function is the constant in that
function, is merely a special case of Theorem I, with $a = 0$. (You have already encountered an
example of this case in Exercise 6.2-3.)

Theorem III If $q = v$, then $\lim_{v \to N} q = N$.
If $q = v^k$, then $\lim_{v \to N} q = N^k$.

Example 2 Given $q = v^3$, we have $\lim_{v \to 2} q = (2)^3 = 8$.

You may have noted that, in Theorems I through III, what is done to find the limit of $q$
as $v \to N$ is indeed to let $v = N$. But these are special cases, and they do not vitiate the
general rule that “$v \to N$” does not mean “$v = N$.”

Theorems Involving Two Functions 

If we have two functions of the same independent variable $v$, $q_1 = g(v)$ and $q_2 = h(v)$, and
if both functions possess limits as follows:

$$\lim_{v \to N} q_1 = L_1 \qquad \lim_{v \to N} q_2 = L_2$$

where $L_1$ and $L_2$ are two finite numbers, the following theorems are applicable.

Theorem IV (sum-difference limit theorem)

$$\lim_{v \to N} (q_1 \pm q_2) = L_1 \pm L_2$$

The limit of a sum (difference) of two functions is the sum (difference) of their respective
limits.

In particular, we note that

$$\lim_{v \to N} 2q_1 = \lim_{v \to N} (q_1 + q_1) = L_1 + L_1 = 2L_1$$

which is in line with Theorem I.

Theorem V (product limit theorem)

$$\lim_{v \to N} (q_1 q_2) = L_1 L_2$$

The limit of a product of two functions is the product of their limits.

Applied to the square of a function, this gives

$$\lim_{v \to N} (q_1 q_1) = L_1 L_1 = L_1^2$$

which is in line with Theorem III.

Theorem VI (quotient limit theorem)

$$\lim_{v \to N} \frac{q_1}{q_2} = \frac{L_1}{L_2} \qquad (L_2 \neq 0)$$

The limit of a quotient of two functions is the quotient of their limits. Naturally, the limit
$L_2$ is restricted to be nonzero; otherwise the quotient is undefined.

Example 3 Find $\lim_{v \to 0} (1 + v)/(2 + v)$. Since we have here $\lim_{v \to 0} (1 + v) = 1$ and
$\lim_{v \to 0} (2 + v) = 2$, the desired limit is $\frac{1}{2}$.

Remember that $L_1$ and $L_2$ represent finite numbers; otherwise these theorems do not
apply. In the case of Theorem VI, furthermore, $L_2$ must be nonzero as well. If these
restrictions are not satisfied, we must fall back on the method of limit evaluation illustrated
in Examples 2 and 3 in Sec. 6.4, which relate to the cases, respectively, of $L_2$ being zero
and of $L_2$ being infinite.
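
Theorems IV through VI can be illustrated numerically with the two functions of Example 3. The sketch below is an assumption-laden illustration: the helper `near` and the step sizes are hypothetical, and tabulating values near $N$ only suggests the limits that the theorems guarantee.

```python
def near(f, N, k):
    """Values of f just to the left and right of N, at distance 10**-k
    (the function is never evaluated at N itself)."""
    d = 10.0 ** -k
    return f(N - d), f(N + d)

def g(v):   # q1 = 1 + v, so L1 = 1 as v -> 0
    return 1 + v

def h(v):   # q2 = 2 + v, so L2 = 2 as v -> 0
    return 2 + v

for k in (2, 4, 6):
    print(k,
          near(lambda v: g(v) + h(v), 0, k),   # tends to L1 + L2 = 3
          near(lambda v: g(v) * h(v), 0, k),   # tends to L1 * L2 = 2
          near(lambda v: g(v) / h(v), 0, k))   # tends to L1 / L2 = 0.5
```

As $k$ grows, the tabulated sums, products, and quotients close in on $L_1 + L_2$, $L_1 L_2$, and $L_1 / L_2$ from both sides.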

Limit of a Polynomial Function 

With the given limit theorems at our disposal, we can easily evaluate the limit of any
polynomial function

$$q = g(v) = a_0 + a_1 v + a_2 v^2 + \cdots + a_n v^n \tag{6.11}$$

as $v$ tends to the number $N$. Since the limits of the separate terms are, respectively,

$$\lim_{v \to N} a_0 = a_0 \qquad \lim_{v \to N} a_1 v = a_1 N \qquad \lim_{v \to N} a_2 v^2 = a_2 N^2 \qquad \text{(etc.)}$$

the limit of the polynomial function is (by the sum limit theorem)

$$\lim_{v \to N} q = a_0 + a_1 N + a_2 N^2 + \cdots + a_n N^n \tag{6.12}$$

This limit is also, we note, actually equal to $g(N)$, that is, equal to the value of the function
in (6.11) when $v = N$. This particular result will prove important in discussing the concept
of continuity of the polynomial function.
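
The result in (6.12) can be illustrated by evaluating a polynomial at points close to $N$ and at $N$ itself. In this sketch the helper `poly`, the use of Horner's method, and the sample polynomial $q = 2 - 3v + v^3$ are all assumptions chosen for illustration.

```python
def poly(coeffs, v):
    """Evaluate a0 + a1*v + ... + an*v**n by Horner's method,
    where coeffs = [a0, a1, ..., an]."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * v + a
    return result

coeffs = [2, -3, 0, 1]      # hypothetical example: q = 2 - 3v + v^3
N = 2
for d in (1e-2, 1e-4, 1e-6):
    print(d, poly(coeffs, N - d), poly(coeffs, N + d))
print(poly(coeffs, N))      # the value g(N) that the entries above approach
```

The values on either side of $N = 2$ settle toward $g(2) = 2 - 6 + 8 = 4$, in line with (6.12).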

EXERCISE 6.6

1. Find the limits of the function $q = 7 - 9v + v^2$:

(a) As $v \to 0$ (b) As $v \to 3$ (c) As $v \to -1$

2. Find the limits of $q = (v + 2)(v - 3)$:

(a) As $v \to -1$ (b) As $v \to 0$ (c) As $v \to 5$

3. Find the limits of $q = (3v + 5)/(v + 2)$:

(a) As $v \to 0$ (b) As $v \to 5$ (c) As $v \to -1$


6.7 Continuity and Differentiability of a Function

The preceding discussion of the concept of limit and its evaluation can now be used to
define the continuity and differentiability of a function. These notions bear directly on the
derivative of the function, which is what interests us.

Continuity of a Function

When a function $q = g(v)$ possesses a limit as $v$ tends to the point $N$ in the domain, and
when this limit is also equal to $g(N)$, that is, equal to the value of the function at $v = N$,
the function is said to be continuous at $N$. As defined here, the term continuity involves no
less than three requirements: (1) the point $N$ must be in the domain of the function; i.e.,
$g(N)$ is defined; (2) the function must have a limit as $v \to N$; i.e., $\lim_{v \to N} g(v)$ exists; and
(3) that limit must be equal in value to $g(N)$; i.e., $\lim_{v \to N} g(v) = g(N)$.

It is important to note that while the point $(N, L)$ was excluded from consideration in
discussing the limit of the curve in Fig. 6.3, we are no longer excluding it in the present
context. Rather, as the third requirement specifically states, the point $(N, L)$ must be on the
graph of the function before the function can be considered as continuous at point $N$.

Let us check whether the functions shown in Fig. 6.2 are continuous. In diagram a, all
three requirements are met at point $N$. Point $N$ is in the domain; $q$ has the limit $L$ as $v \to N$;
and the limit $L$ happens also to be the value of the function at $N$. Thus, the function
represented by that curve is continuous at $N$. The same is true of the function depicted in
Fig. 6.2b, since $L$ is the limit of the function as $v$ approaches the value $N$ in the domain, and
since $L$ is also the value of the function at $N$. This last graphic example should suffice to
establish that the continuity of a function at point $N$ does not necessarily imply that the graph
of the function is “smooth” at $v = N$, for the point $(N, L)$ in Fig. 6.2b is actually a “sharp”
point and yet the function is continuous at that value of $v$.

When a function $q = g(v)$ is continuous at all values of $v$ in the interval $(a, b)$, it is said
to be continuous in that interval. If the function is continuous at all points in a subset $S$ of
the domain (where the subset $S$ may be the union of several disjoint intervals), it is said to
be continuous in $S$. And, finally, if the function is continuous at all points in its domain, we
say that it is continuous in its domain. Even in this latter case, however, the graph of the
function may nevertheless show a discontinuity (a gap) at some value of $v$, say, at $v = 5$, if
that value of $v$ is not in its domain.

Again referring to Fig. 6.2, we see that in diagram c the function is discontinuous at $N$
because a limit does not exist at that point, in violation of the second requirement of
continuity. Nevertheless, the function does satisfy the requirements of continuity in the interval
$(0, N)$ of the domain, as well as in the interval $[N, \infty)$. Diagram d obviously is also
discontinuous at $v = N$. This time, discontinuity emanates from the fact that $N$ is excluded
from the domain, in violation of the first requirement of continuity.

On the basis of the graphs in Fig. 6.2, it appears that sharp points are consistent with
continuity, as in diagram b, but that gaps are taboo, as in diagrams c and d. This is indeed
the case. Roughly speaking, therefore, a function that is continuous in a particular interval
is one whose graph can be drawn for the said interval without lifting the pencil or pen from
the paper, a feat which is possible even if there are sharp points, but impossible when gaps
occur.

Polynomial and Rational Functions 

Let us now consider the continuity of certain frequently encountered functions. For any
polynomial function, such as $q = g(v)$ in (6.11), we have found from (6.12) that $\lim_{v \to N} q$
exists and is equal to the value of the function at $N$. Since $N$ is a point (any point) in the
domain of the function, we can conclude that any polynomial function is continuous in its
domain. This is a very useful piece of information, because polynomial functions will be
encountered very often.

What about rational functions? Regarding continuity, there exists an interesting theorem
(the continuity theorem) which states that the sum, difference, product, and quotient of any
finite number of functions that are continuous in the domain are, respectively, also
continuous in the domain. As a result, any rational function (a quotient of two polynomial
functions) must also be continuous in its domain.


Example 1 The rational function

$$q = g(v) = \frac{4v^2}{v^2 + 1}$$

is defined for all finite real numbers; thus its domain consists of the interval $(-\infty, \infty)$. For
any number $N$ in the domain, the limit of $q$ is (by the quotient limit theorem)

$$\lim_{v \to N} q = \frac{\lim_{v \to N} (4v^2)}{\lim_{v \to N} (v^2 + 1)} = \frac{4N^2}{N^2 + 1}$$

which is equal to $g(N)$. Thus the three requirements of continuity are all met at $N$.
Moreover, we note that $N$ can represent any point in the domain of this function; consequently,
this function is continuous in its domain.


Example 2 The rational function

$$q = \frac{v^3 + v^2 - 4v - 4}{v^2 - 4}$$

is not defined at $v = 2$ and at $v = -2$. Since those two values of $v$ are not in the domain, the
function is discontinuous at $v = -2$ and $v = 2$, despite the fact that a limit of $q$ exists as
$v \to -2$ or 2. Graphically, this function will display a gap at each of these two values of $v$.
But for other values of $v$ (those which are in the domain), this function is continuous.
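
The two claims made about this rational function, that it is undefined at $v = \pm 2$ and yet has a limit there, can be illustrated numerically. The sketch below is illustrative only: the function name `q`, the step size $10^{-4}$, and the reliance on Python's `ZeroDivisionError` to signal an undefined value are assumptions of this example.

```python
def q(v):
    """Example 2: q = (v^3 + v^2 - 4v - 4) / (v^2 - 4), undefined at v = 2 and v = -2."""
    return (v**3 + v**2 - 4*v - 4) / (v**2 - 4)

for N in (2, -2):
    # Values of q slightly to the left and to the right of N.
    print(N, q(N - 1e-4), q(N + 1e-4))
    try:
        q(N)
    except ZeroDivisionError:
        print("q is undefined at v =", N)
```

Near $v = 2$ the values of $q$ cluster around 3, and near $v = -2$ around $-1$, even though evaluating $q$ at either point fails. The first requirement of continuity is violated while the second is met.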


Differentiability of a Function 

The previous discussion has provided us with the tools for ascertaining whether any
function has a limit as its independent variable approaches some specific value. Thus we can try
to take the limit of any function $y = f(x)$ as $x$ approaches some chosen value, say, $x_0$.
However, we can also apply the “limit” concept at a different level and take the limit of the
difference quotient of that function, $\Delta y / \Delta x$, as $\Delta x$ approaches zero. The outcomes of
limit-taking at these two different levels relate to two different, though related, properties
of the function $f$.

Taking the limit of the function $y = f(x)$ itself, we can, in line with the discussion of
the preceding subsection, examine whether the function $f$ is continuous at $x = x_0$. The
conditions for continuity are (1) $x = x_0$ must be in the domain of the function $f$, (2) $y$ must have
a limit as $x \to x_0$, and (3) the said limit must be equal to $f(x_0)$. When these are satisfied,
we can write

$$\lim_{x \to x_0} f(x) = f(x_0) \qquad \text{[continuity condition]} \tag{6.13}$$

In contrast, when the “limit” concept is applied to the difference quotient $\Delta y / \Delta x$ as
$\Delta x \to 0$, we deal instead with the question of whether the function $f$ is differentiable at
$x = x_0$, i.e., whether the derivative $dy/dx$ exists at $x = x_0$, or whether $f'(x_0)$ exists. The
term differentiable is used here because the process of obtaining the derivative $dy/dx$ is
known as differentiation (also called derivation). Since $f'(x_0)$ exists if and only if the limit
of $\Delta y / \Delta x$ exists at $x = x_0$ as $\Delta x \to 0$, the symbolic expression of the differentiability of
$f$ is

$$f'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \equiv \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \qquad \text{[differentiability condition]} \tag{6.14}$$

These two properties, continuity and differentiability, are very intimately related to each
other: the continuity of $f$ is a necessary condition for its differentiability (although, as we
shall see later, this condition is not sufficient). What this means is that, to be differentiable
at $x = x_0$, the function must first pass the test of being continuous at $x = x_0$. To prove this,
we shall demonstrate that, given a function $y = f(x)$, its continuity at $x = x_0$ follows from
its differentiability at $x = x_0$; i.e., condition (6.13) follows from condition (6.14). Before
doing this, however, let us simplify the notation somewhat by (1) replacing $x_0$ with the
symbol $N$ and (2) replacing $(x_0 + \Delta x)$ with the symbol $x$. The latter is justifiable because
the postchange value of $x$ can be any number (depending on the magnitude of the change)
and hence is a variable denotable by $x$. The equivalence of the two notation systems is
shown in Fig. 6.4, where the old notations appear (in brackets) alongside the new. Note that,
with the notational change, $\Delta x$ now becomes $(x - N)$, so that the expression “$\Delta x \to 0$”
becomes “$x \to N$,” which is analogous to the expression $v \to N$ used before in connection
with the function $q = g(v)$. Accordingly, (6.13) and (6.14) can now be rewritten,
respectively, as

$$\lim_{x \to N} f(x) = f(N) \tag{6.13'}$$

$$f'(N) = \lim_{x \to N} \frac{f(x) - f(N)}{x - N} \tag{6.14'}$$

What we want to show is, therefore, that the continuity condition (6.13') follows from
the differentiability condition (6.14'). First, since the notation $x \to N$ implies that $x \neq N$,
so that $x - N$ is a nonzero number, it is permissible to write the following identity:

$$f(x) - f(N) \equiv \frac{f(x) - f(N)}{x - N} (x - N) \tag{6.15}$$

[Figure 6.4]

Taking the limit of each side of (6.15) as $x \to N$ yields the following results:

$$\text{Left side} = \lim_{x \to N} f(x) - \lim_{x \to N} f(N) \qquad \text{[difference limit theorem]}$$
$$= \lim_{x \to N} f(x) - f(N) \qquad \text{[$f(N)$ is a constant]}$$

$$\text{Right side} = \lim_{x \to N} \frac{f(x) - f(N)}{x - N} \lim_{x \to N} (x - N) \qquad \text{[product limit theorem]}$$
$$= f'(N) \left( \lim_{x \to N} x - \lim_{x \to N} N \right) \qquad \text{[by (6.14') and difference limit theorem]}$$
$$= f'(N)(N - N) = 0$$

Note that we could not have written these results if condition (6.14') had not been granted,
for if $f'(N)$ did not exist, then the right-side expression (and hence also the left-side
expression) in (6.15) would not possess a limit. If $f'(N)$ does exist, however, the two sides
will have limits as shown in the previous equations. Moreover, when the left-side result and
the right-side result are equated, we get $\lim_{x \to N} f(x) - f(N) = 0$, which is identical with
(6.13'). Thus we have proved that continuity, as shown in (6.13'), follows from
differentiability, as shown in (6.14'). In general, if a function is differentiable at every point in its
domain, we may conclude that it must be continuous in its domain.

Although differentiability implies continuity, the converse is not true. That is,
continuity is a necessary, but not a sufficient, condition for differentiability. To demonstrate this,
we merely have to produce a counterexample. Let us consider the function

$$y = f(x) = |x - 2| + 1 \tag{6.16}$$

which is graphed in Fig. 6.5. As can be readily shown, this function is not differentiable,
though continuous, when $x = 2$. That the function is continuous at $x = 2$ is easy to
establish. First, $x = 2$ is in the domain of the function. Second, the limit of $y$ exists as $x$ tends
to 2; to be specific, $\lim_{x \to 2^+} y = \lim_{x \to 2^-} y = 1$. Third, $f(2)$ is also found to be 1. Thus all three
requirements of continuity are met. To show that the function $f$ is not differentiable at
$x = 2$, we must show that the limit of the difference quotient

$$\lim_{x \to 2} \frac{f(x) - f(2)}{x - 2} = \lim_{x \to 2} \frac{|x - 2| + 1 - 1}{x - 2} = \lim_{x \to 2} \frac{|x - 2|}{x - 2}$$

does not exist. This involves the demonstration of a disparity between the left-side and the
right-side limits. Since, in considering the right-side limit, $x$ must exceed 2, according to the
definition of absolute value in (6.8) we have $|x - 2| = x - 2$. Thus the right-side limit is

$$\lim_{x \to 2^+} \frac{|x - 2|}{x - 2} = \lim_{x \to 2^+} \frac{x - 2}{x - 2} = \lim_{x \to 2^+} 1 = 1$$

On the other hand, in considering the left-side limit, $x$ must be less than 2; thus, according
to (6.8), $|x - 2| = -(x - 2)$. Consequently, the left-side limit is

$$\lim_{x \to 2^-} \frac{|x - 2|}{x - 2} = \lim_{x \to 2^-} \frac{-(x - 2)}{x - 2} = \lim_{x \to 2^-} (-1) = -1$$

which is different from the right-side limit. This shows that continuity does not guarantee
differentiability. In sum, all differentiable functions are continuous, but not all continuous
functions are differentiable.
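
The disparity between the two one-sided limits can be seen by computing difference quotients of (6.16) on each side of $x = 2$. In this minimal sketch the helper name `difference_quotient` and the sequence of step sizes are assumptions made for illustration.

```python
def f(x):
    """The function in (6.16): y = |x - 2| + 1."""
    return abs(x - 2) + 1

def difference_quotient(f, N, dx):
    """[f(N + dx) - f(N)] / dx for a nonzero change dx."""
    return (f(N + dx) - f(N)) / dx

steps = [10.0 ** -k for k in range(1, 6)]
right = [difference_quotient(f, 2, dx) for dx in steps]    # x approaches 2 from above
left = [difference_quotient(f, 2, -dx) for dx in steps]    # x approaches 2 from below
print(right)
print(left)
```

Up to floating-point rounding, every right-side quotient is $+1$ and every left-side quotient is $-1$, so no single number can serve as the limit of the difference quotient at $x = 2$.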

In Fig. 6.5, the nondifferentiability of the function at $x = 2$ is manifest in the fact that
the point $(2, 1)$ has no tangent line defined, and hence no definite slope can be assigned to
the point. Specifically, to the left of that point, the curve has a slope of $-1$, but to the right
it has a slope of $+1$, and the slopes on the two sides display no tendency to approach a
common magnitude at $x = 2$. The point $(2, 1)$ is, of course, a special point; it is the only
sharp point on the curve. At other points on the curve, the derivative is defined and the
function is differentiable. More specifically, the function in (6.16) can be divided into two
linear functions as follows:

Left part: $y = -(x - 2) + 1 = 3 - x \quad (x \leq 2)$

Right part: $y = (x - 2) + 1 = x - 1 \quad (x > 2)$

The left part is differentiable in the interval $(-\infty, 2)$, and the right part is differentiable in
the interval $(2, \infty)$ in the domain.

In general, differentiability is a more restrictive condition than continuity, because it
requires something beyond continuity. Continuity at a point only rules out the presence of a
gap, whereas differentiability rules out “sharpness” as well. Therefore, differentiability
calls for “smoothness” of the function (curve) as well as its continuity. Most of the
functions employed in economics have the property that they are differentiable everywhere.
When general functions are used, moreover, they are often assumed to be everywhere
differentiable, as we shall assume in the subsequent discussion.


EXERCISE 6.7

1. A function $y = f(x)$ is discontinuous at $x = x_0$ when any of the three requirements for
continuity is violated at $x = x_0$. Construct three graphs to illustrate the violation of each
of those requirements.

2. Taking the set of all finite real numbers as the domain of the function $q = g(v) = v^2 - 5v - 2$:

(a) Find the limit of $q$ as $v$ tends to $N$ (a finite real number).

(b) Check whether this limit is equal to $g(N)$.

(c) Check whether the function is continuous at $N$ and continuous in its domain.

3. Given the function $q = g(v) = \dfrac{v + 2}{v^2 + 2}$:

(a) Use the limit theorems to find $\lim_{v \to N} q$, $N$ being a finite real number.

(b) Check whether this limit is equal to $g(N)$.

(c) Check the continuity of the function $g(v)$ at $N$ and in its domain $(-\infty, \infty)$.

4. Given $y = f(x) = \dfrac{x^2 - 9x + 20}{x - 4}$:

(a) Is it possible to apply the quotient limit theorem to find the limit of this function as
$x \to 4$?

(b) Is this function continuous at $x = 4$? Why?

(c) Find a function which, for $x \neq 4$, is equivalent to the given function, and obtain
from the equivalent function the limit of $y$ as $x \to 4$.

5. In the rational function in Example 2, the numerator is evenly divisible by the
denominator, and the quotient is $v + 1$. Can we for that reason replace that function outright
by $q = v + 1$? Why or why not?

6. On the basis of the graphs of the six functions in Fig. 2.8, would you conclude that
each such function is differentiable at every point in its domain? Explain.



Chapter 7


Rules of Differentiation and Their Use in Comparative Statics


The central problem of comparative-static analysis, that of finding a rate of change, can be
identified with the problem of finding the derivative of some function $y = f(x)$, provided
only an infinitesimal change in $x$ is being considered. Even though the derivative $dy/dx$ is
defined as the limit of the difference quotient $q = g(v)$ as $v \to 0$, it is by no means
necessary to undertake the process of limit-taking each time the derivative of a function is
sought, for there exist various rules of differentiation (derivation) that will enable us to
obtain the desired derivatives directly. Instead of going into comparative-static models
immediately, therefore, let us begin by learning some rules of differentiation.

7.1 Rules of Differentiation for a Function of One Variable 


First, let us discuss three rules that apply, respectively, to the following types of function of
a single independent variable: $y = k$ (constant function) and $y = x^n$ and $y = cx^n$ (power
functions). All these have smooth, continuous graphs and are therefore differentiable
everywhere.

Constant-Function Rule 

The derivative of a constant function $y = k$, or $f(x) = k$, is identically zero, i.e., is zero
for all values of $x$. Symbolically, this rule may be stated as: Given $y = f(x) = k$, the
derivative is

$$\frac{dy}{dx} = \frac{dk}{dx} = 0 \qquad \text{or} \qquad f'(x) = 0$$

Alternatively, we may state the rule as: Given $y = f(x) = k$, the derivative is

$$\frac{d}{dx} y = \frac{d}{dx} f(x) = \frac{d}{dx} k = 0$$

where the derivative symbol has been separated into two parts, $d/dx$ on the one hand, and
$y$ [or $f(x)$ or $k$] on the other. The first part, $d/dx$, is an operator symbol, which instructs us
to perform a particular mathematical operation. Just as the operator symbol $\sqrt{\ }$ instructs
us to take a square root, the symbol $d/dx$ represents an instruction to take the derivative of,
or to differentiate, (some function) with respect to the variable $x$. The function to be
operated on (to be differentiated) is indicated in the second part; here it is $y = f(x) = k$.

The proof of the rule is as follows. Given $f(x) = k$, we have $f(N) = k$ for any value
of $N$. Thus the value of $f'(N)$, the value of the derivative at $x = N$, as defined in (6.14')
is

$$f'(N) = \lim_{x \to N} \frac{f(x) - f(N)}{x - N} = \lim_{x \to N} \frac{k - k}{x - N} = \lim_{x \to N} 0 = 0$$

Moreover, since $N$ represents any value of $x$ at all, the result $f'(N) = 0$ can be immediately
generalized to $f'(x) = 0$. This proves the rule.

It is important to distinguish clearly between the statement $f'(x) = 0$ and the
similar-looking but different statement $f'(x_0) = 0$. By $f'(x) = 0$, we mean that the derivative
function $f'$ has a zero value for all values of $x$; in writing $f'(x_0) = 0$, on the other hand, we
are merely associating the zero value of the derivative with a particular value of $x$, namely,
$x = x_0$.

As discussed before, the derivative of a function has its geometric counterpart in
the slope of the curve. The graph of a constant function, say, a fixed-cost function
$C_F = f(Q) = \$1{,}200$, is a horizontal straight line with a zero slope throughout.
Correspondingly, the derivative must also be zero for all values of $Q$:

$$\frac{d}{dQ} C_F = \frac{d}{dQ} 1200 = 0$$


Power-Function Rule 

The derivative of a power function $y = f(x) = x^n$ is $nx^{n-1}$. Symbolically, this is expressed as

$$\frac{d}{dx}x^n = nx^{n-1} \qquad \text{or} \qquad f'(x) = nx^{n-1} \tag{7.1}$$

Example 1

The derivative of $y = x^3$ is $\dfrac{dy}{dx} = \dfrac{d}{dx}x^3 = 3x^2$.

Example 2

The derivative of $y = x^9$ is $\dfrac{d}{dx}x^9 = 9x^8$.

This rule is valid for any real-valued power of $x$; that is, the exponent can be any real number. But we shall prove it only for the case where $n$ is some positive integer. In the simplest case, that of $n = 1$, the function is $f(x) = x$, and according to the rule, the derivative is

$$f'(x) = \frac{d}{dx}x = 1(x^0) = 1$$

The proof of this result follows easily from the definition of $f'(N)$ in (6.14'). Given $f(x) = x$, the derivative value at any value of $x$, say, $x = N$, is

$$f'(N) = \lim_{x \to N}\frac{f(x) - f(N)}{x - N} = \lim_{x \to N}\frac{x - N}{x - N} = \lim_{x \to N} 1 = 1$$

Since $N$ represents any value of $x$, it is permissible to write $f'(x) = 1$. This proves the rule for the case of $n = 1$. As the graphical counterpart of this result, we see that the function $y = f(x) = x$ plots as a 45° line, and it has a slope of $+1$ throughout.

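For the cases of larger integers, $n = 2, 3, \ldots$, let us first note the following identities:

$$\begin{aligned}
\frac{x^2 - N^2}{x - N} &= x + N && \text{[2 terms on the right]} \\
\frac{x^3 - N^3}{x - N} &= x^2 + Nx + N^2 && \text{[3 terms on the right]} \\
&\;\;\vdots \\
\frac{x^n - N^n}{x - N} &= x^{n-1} + Nx^{n-2} + N^2x^{n-3} + \cdots + N^{n-1} && \text{[$n$ terms on the right]}
\end{aligned} \tag{7.2}$$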


On the basis of (7.2), we can express the derivative of a power function $f(x) = x^n$ at $x = N$ as follows:

$$\begin{aligned}
f'(N) &= \lim_{x \to N}\frac{f(x) - f(N)}{x - N} = \lim_{x \to N}\frac{x^n - N^n}{x - N} \\
&= \lim_{x \to N}\left(x^{n-1} + Nx^{n-2} + \cdots + N^{n-1}\right) && \text{[by (7.2)]} \\
&= \lim_{x \to N}x^{n-1} + \lim_{x \to N}Nx^{n-2} + \cdots + \lim_{x \to N}N^{n-1} && \text{[sum limit theorem]} \\
&= N^{n-1} + N^{n-1} + \cdots + N^{n-1} && \text{[a total of $n$ terms]} \\
&= nN^{n-1}
\end{aligned} \tag{7.3}$$


Again, $N$ is any value of $x$; thus this last result can be generalized to

$$f'(x) = nx^{n-1}$$

which proves the rule for $n$, any positive integer.
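The two ingredients of this proof, the identity (7.2) and the collapse of its right-hand side into $n$ equal terms when $x = N$, can be spot-checked with exact integer arithmetic. The sketch below is illustrative only; the function names and the sample values $n = 5$, $N = 2$, $x = 3$ are assumptions chosen here.

```python
def quotient(x, N, n):
    """Left side of (7.2): (x^n - N^n) / (x - N), defined only for x != N."""
    return (x ** n - N ** n) / (x - N)

def expanded_sum(x, N, n):
    """Right side of (7.2): x^(n-1) + N*x^(n-2) + ... + N^(n-1), n terms in all."""
    return sum(N ** k * x ** (n - 1 - k) for k in range(n))

n, N = 5, 2

# The two sides of (7.2) agree at x = 3.
lhs = quotient(3, N, n)        # (243 - 32) / 1
rhs = expanded_sum(3, N, n)    # 81 + 54 + 36 + 24 + 16

# Letting x = N in the n-term sum gives n copies of N^(n-1), as in (7.3).
limit_value = expanded_sum(N, N, n)
```

With these values both sides of (7.2) equal 211, and the sum evaluated at $x = N$ equals $5 \cdot 2^4 = 80$, which is $nN^{n-1}$.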

As mentioned previously, this rule applies even when the exponent $n$ in the power expression $x^n$ is not a positive integer. The following examples serve to illustrate its application to the latter cases.


Example 3

Find the derivative of $y = x^0$. Applying (7.1), we find

$$\frac{d}{dx}x^0 = 0(x^{-1}) = 0$$

Example 4

Find the derivative of $y = 1/x^3$. This involves the reciprocal of a power, but by rewriting the function as $y = x^{-3}$, we can again apply (7.1) to get the derivative:

$$\frac{d}{dx}x^{-3} = -3x^{-4} \qquad \left[= \frac{-3}{x^4}\right]$$

Example 5

Find the derivative of $y = \sqrt{x}$. A square root is involved in this case, but since $\sqrt{x} = x^{1/2}$, the derivative can be found as follows:

$$\frac{d}{dx}x^{1/2} = \frac{1}{2}x^{-1/2} \qquad \left[= \frac{1}{2\sqrt{x}} = \frac{\sqrt{x}}{2x}\right]$$

Derivatives are themselves functions of the independent variable $x$. In Example 1, for instance, the derivative is $dy/dx = 3x^2$, or $f'(x) = 3x^2$, so that a different value of $x$ will result in a different value of the derivative, such as

$$f'(1) = 3(1)^2 = 3 \qquad f'(2) = 3(2)^2 = 12$$

These specific values of the derivative can be expressed alternatively as

$$\left.\frac{dy}{dx}\right|_{x=1} = 3 \qquad \left.\frac{dy}{dx}\right|_{x=2} = 12$$

but the notations $f'(1)$ and $f'(2)$ are obviously preferable because of their simplicity.

It is of the utmost importance to realize that, to find the derivative values $f'(1)$, $f'(2)$, etc., we must first differentiate the function $f(x)$, to get the derivative function $f'(x)$, and then let $x$ assume specific values in $f'(x)$. To substitute specific values of $x$ into the primitive function $f(x)$ prior to differentiation is definitely not permissible. As an illustration, if we let $x = 1$ in the function of Example 1 before differentiation, the function will degenerate into $y = 1^3 = 1$, a constant function, which will yield a zero derivative rather than the correct answer of $f'(x) = 3x^2$.
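The order of operations can be made concrete with a short sketch. It is an illustration under stated assumptions: `f` is the function $y = x^3$ of Example 1, `f_prime` is its derivative by the power rule, and the numerical helper `diff_quotient` with its step size is a device introduced here, not something from the text.

```python
def f(x):
    return x ** 3            # primitive function of Example 1

def f_prime(x):
    return 3 * x ** 2        # derivative function from the power rule

def diff_quotient(func, x, h=1e-6):
    """Central difference quotient, used only as a numerical cross-check."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Correct order: differentiate first, then substitute x = 1 and x = 2.
values = {x: f_prime(x) for x in (1, 2)}

# Wrong order: substituting x = 1 first freezes the function at the
# constant f(1) = 1, and a constant function has a zero slope.
wrong = diff_quotient(lambda x: f(1), 1.0)

# Numerical slope of the unfrozen function at x = 1, close to f'(1) = 3.
numeric = diff_quotient(f, 1.0)
```

The dictionary `values` holds 3 and 12, matching $f'(1)$ and $f'(2)$ above, while `wrong` is zero because the substitution was made too early.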


Power-Function Rule Generalized 

When a multiplicative constant $c$ appears in the power function, so that $f(x) = cx^n$, its derivative is

$$\frac{d}{dx}cx^n = cnx^{n-1} \qquad \text{or} \qquad f'(x) = cnx^{n-1}$$

This result shows that, in differentiating $cx^n$, we can simply retain the multiplicative constant $c$ intact and then differentiate the term $x^n$ according to (7.1).

Example 6

Given $y = 2x$, we have $dy/dx = 2x^0 = 2$.

Example 7

Given $f(x) = 4x^3$, the derivative is $f'(x) = 12x^2$.

Example 8

The derivative of $f(x) = 3x^{-2}$ is $f'(x) = -6x^{-3}$.


For a proof of this new rule, consider the fact that for any value of $x$, say, $x = N$, the value of the derivative of $f(x) = cx^n$ is

$$\begin{aligned}
f'(N) &= \lim_{x \to N}\frac{f(x) - f(N)}{x - N} = \lim_{x \to N}\frac{cx^n - cN^n}{x - N} = \lim_{x \to N} c\left(\frac{x^n - N^n}{x - N}\right) \\
&= \lim_{x \to N} c \, \lim_{x \to N}\frac{x^n - N^n}{x - N} && \text{[product limit theorem]} \\
&= c \lim_{x \to N}\frac{x^n - N^n}{x - N} && \text{[limit of a constant]} \\
&= cnN^{n-1} && \text{[from (7.3)]}
\end{aligned}$$

In view of the fact that $N$ is any value of $x$, this last result can be generalized immediately to $f'(x) = cnx^{n-1}$, which proves the rule.
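Since the generalized rule is purely mechanical, it can be written as a one-line function. The sketch below is an assumption-laden illustration: the name `power_rule` and the representation of $cx^n$ as a pair (coefficient, exponent) are choices made here for convenience.

```python
from fractions import Fraction

def power_rule(c, n):
    """Differentiate c*x^n; return (coefficient, exponent) of c*n*x^(n-1)."""
    return (c * n, n - 1)

example_6 = power_rule(2, 1)              # y = 2x
example_7 = power_rule(4, 3)              # f(x) = 4x^3
example_8 = power_rule(3, -2)             # f(x) = 3x^(-2)
root = power_rule(1, Fraction(1, 2))      # y = x^(1/2), the square root
```

The three results reproduce Examples 6 to 8, and the last call gives the pair $(1/2, -1/2)$, that is, $\tfrac{1}{2}x^{-1/2}$ for the square-root function.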


EXERCISE 7.1 


1. Find the derivative of each of the following functions:

(a) $y = x^{12}$ (c) $y = 7x^5$ (e) $w = -4u^{1/2}$

(b) $y = 63$ (d) $w = 3u^{-1}$ (f) $w = 4u^{1/4}$

2. Find the following:

[expressions illegible in the source]

3. Find $f'(1)$ and $f'(2)$ from the following functions:

(a) $y = f(x) = 18x$ (c) $f(x) = -5x^{-2}$ (e) $f(w) = 6w^{1/3}$

(b) [illegible in the source] (d) [illegible in the source] (f) [illegible in the source]

4. Graph a function $f(x)$ that gives rise to the derivative function $f'(x) = 0$. Then graph a function $g(x)$ characterized by $g'(x_0) = 0$.


7.2 Rules of Differentiation Involving Two or More Functions of the Same Variable

The three rules presented in Sec. 7.1 are each concerned with a single given function $f(x)$. Now suppose that we have two differentiable functions of the same variable $x$, say, $f(x)$ and $g(x)$, and we want to differentiate the sum, difference, product, or quotient formed with these two functions. In such circumstances, are there appropriate rules that apply? More concretely, given two functions, say, $f(x) = 3x^2$ and $g(x) = 9x^{12}$, how do we get the derivative of, say, $3x^2 + 9x^{12}$, or the derivative of $(3x^2)(9x^{12})$?

Sum-Difference Rule 

The derivative of a sum (difference) of two functions is the sum (difference) of the derivatives of the two functions:

$$\frac{d}{dx}[f(x) \pm g(x)] = \frac{d}{dx}f(x) \pm \frac{d}{dx}g(x) = f'(x) \pm g'(x)$$

The proof of this again involves the application of the definition of a derivative and of the 
various limit theorems. We shall omit the proof and, instead, merely verify its validity and 
illustrate its application. 


Example 1 


From the function $y = 14x^3$, we can obtain the derivative $dy/dx = 42x^2$. But $14x^3 = 5x^3 + 9x^3$, so that $y$ may be regarded as the sum of two functions $f(x) = 5x^3$ and $g(x) = 9x^3$. According to the sum rule, we then have

$$\frac{dy}{dx} = \frac{d}{dx}(5x^3 + 9x^3) = \frac{d}{dx}5x^3 + \frac{d}{dx}9x^3 = 15x^2 + 27x^2 = 42x^2$$

which is identical with our earlier result.


This rule, which we stated in terms of two functions, can easily be extended to more functions. Thus, it is also valid to write

$$\frac{d}{dx}[f(x) \pm g(x) \pm h(x)] = f'(x) \pm g'(x) \pm h'(x)$$

Example 2

The function cited in Example 1, $y = 14x^3$, can be written as $y = 2x^3 + 13x^3 - x^3$. The derivative of the latter, according to the sum-difference rule, is

$$\frac{dy}{dx} = \frac{d}{dx}(2x^3 + 13x^3 - x^3) = 6x^2 + 39x^2 - 3x^2 = 42x^2$$

which again checks with the previous answer.

This rule is of great practical importance. With it at our disposal, it is now possible to find the derivative of any polynomial function, since the latter is nothing but a sum of power functions.

Example 3

$$\frac{d}{dx}(ax^2 + bx + c) = 2ax + b$$

Example 4

$$\frac{d}{dx}(7x^4 + 2x^3 - 3x + 37) = 28x^3 + 6x^2 - 3 + 0 = 28x^3 + 6x^2 - 3$$

Note that in Examples 3 and 4 the constants $c$ and 37 do not really produce any effect on the derivative, because the derivative of a constant term is zero. In contrast to the multiplicative constant, which is retained during differentiation, the additive constant drops out. This fact provides the mathematical explanation of the well-known economic principle that the fixed cost of a firm does not affect its marginal cost. Given a short-run total-cost function

$$C = Q^3 - 4Q^2 + 10Q + 75$$

the marginal-cost function (for infinitesimal output change) is the limit of the quotient $\Delta C/\Delta Q$, or the derivative of the $C$ function:

$$\frac{dC}{dQ} = 3Q^2 - 8Q + 10$$

whereas the fixed cost is represented by the additive constant 75. Since the latter drops out during the process of deriving $dC/dQ$, the magnitude of the fixed cost obviously cannot affect the marginal cost.
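The point that the additive constant drops out can be seen by differentiating the cost polynomial term by term. The sketch below is illustrative: storing a polynomial as a list of coefficients (constant term first), the function name `poly_derivative`, and the alternative fixed cost of 200 are all assumptions made here.

```python
def poly_derivative(coeffs):
    """coeffs[i] is the coefficient of Q^i; return the derivative's coefficients."""
    return [i * c for i, c in enumerate(coeffs)][1:]

# C = Q^3 - 4Q^2 + 10Q + 75, stored with the constant term first.
total_cost = [75, 10, -4, 1]
marginal_cost = poly_derivative(total_cost)

# Hypothetical variant: same variable cost, fixed cost 200 instead of 75.
other_total_cost = [200, 10, -4, 1]
marginal_cost_other = poly_derivative(other_total_cost)
```

Both calls return the coefficients of $3Q^2 - 8Q + 10$; changing the fixed cost leaves the marginal-cost function untouched.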

In general, if a primitive function $y = f(x)$ represents a total function, then the derivative function $dy/dx$ is its marginal function. Both functions can, of course, be plotted against the variable $x$ graphically; and because of the correspondence between the derivative of a function and the slope of its curve, for each value of $x$ the marginal function should show the slope of the total function at that value of $x$. In Fig. 7.1a, a linear (constant-slope) total function is seen to have a constant marginal function. On the other hand, the nonlinear (varying-slope) total function in Fig. 7.1b gives rise to a curved marginal function, which lies below (above) the horizontal axis when the total function is negatively (positively) sloped. And, finally, the reader may note from Fig. 7.1c (cf. Fig. 6.5) that "nonsmoothness" of a total function will result in a gap (discontinuity) in the marginal or derivative function. This is in sharp contrast to the everywhere-smooth total function in Fig. 7.1b, which gives rise to a continuous marginal function. For this reason, the smoothness of a primitive function can be linked to the continuity of its derivative function. In particular, instead of saying that a certain function is smooth (and differentiable) everywhere, we may alternatively characterize it as a function with a continuous derivative function, and refer to it as a continuously differentiable function.

FIGURE 7.1

The following notations are often used to denote the continuity and the continuous 
differentiability of a function /: 

$f \in C^{(0)}$ or $f \in C$: $f$ is continuous

$f \in C^{(1)}$ or $f \in C'$: $f$ is continuously differentiable

where $C^{(0)}$, or simply $C$, is the symbol for the set of all continuous functions, and $C^{(1)}$, or $C'$, is the symbol for the set of all continuously differentiable functions.

Product Rule 

The derivative of the product of two (differentiable) functions is equal to the first function 
times the derivative of the second function plus the second function times the derivative 
of the first function: 

$$\frac{d}{dx}[f(x)g(x)] = f(x)\frac{d}{dx}g(x) + g(x)\frac{d}{dx}f(x) = f(x)g'(x) + g(x)f'(x) \tag{7.4}$$

It is also possible, of course, to rearrange the terms and express the rule as

$$\frac{d}{dx}[f(x)g(x)] = f'(x)g(x) + f(x)g'(x) \tag{7.4'}$$

Example 5

Find the derivative of $y = (2x + 3)(3x^2)$. Let $f(x) = 2x + 3$ and $g(x) = 3x^2$. Then it follows that $f'(x) = 2$ and $g'(x) = 6x$, and according to (7.4) the desired derivative is

$$\frac{d}{dx}[(2x + 3)(3x^2)] = (2x + 3)(6x) + (3x^2)(2) = 18x^2 + 18x$$

This result can be checked by first multiplying out $f(x)g(x)$ and then taking the derivative of the product polynomial. The product polynomial is in this case $f(x)g(x) = (2x + 3)(3x^2) = 6x^3 + 9x^2$, and direct differentiation does yield the same derivative, $18x^2 + 18x$.
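The check described in Example 5 can be mechanized for polynomials. The following sketch is an illustration only: representing a polynomial as a coefficient list with the constant term first, and the helper names `poly_mul`, `poly_add`, and `poly_derivative`, are assumptions introduced here.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_derivative(p):
    """Term-by-term power rule; a constant polynomial differentiates to [0]."""
    return [i * c for i, c in enumerate(p)][1:] or [0]

f = [3, 2]        # 2x + 3
g = [0, 0, 3]     # 3x^2

# Product rule (7.4): f(x)g'(x) + g(x)f'(x)
by_rule = poly_add(poly_mul(f, poly_derivative(g)),
                   poly_mul(g, poly_derivative(f)))

# Alternative route: multiply out first, then differentiate.
direct = poly_derivative(poly_mul(f, g))
```

Both routes give the coefficient list `[0, 18, 18]`, that is, $18x + 18x^2$, in agreement with the example.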


The important point to remember is that the derivative of a product of two functions is not the simple product of the two separate derivatives. Instead, it is a weighted sum of $f'(x)$ and $g'(x)$, the weights being $g(x)$ and $f(x)$, respectively. Since this differs from what intuitive generalization leads one to expect, let us produce a proof for (7.4). According to (6.13), the value of the derivative of $f(x)g(x)$ when $x = N$ should be

$$\left.\frac{d}{dx}[f(x)g(x)]\right|_{x=N} = \lim_{x \to N}\frac{f(x)g(x) - f(N)g(N)}{x - N} \tag{7.5}$$


But, by adding and subtracting $f(x)g(N)$ in the numerator (thereby leaving the original magnitude unchanged), we can transform the quotient on the right of (7.5) as follows:

$$\frac{f(x)g(x) - f(x)g(N) + f(x)g(N) - f(N)g(N)}{x - N} = f(x)\frac{g(x) - g(N)}{x - N} + g(N)\frac{f(x) - f(N)}{x - N}$$


Substituting this for the quotient on the right of (7.5) and taking its limit, we then get

$$\left.\frac{d}{dx}[f(x)g(x)]\right|_{x=N} = \lim_{x \to N}f(x)\,\lim_{x \to N}\frac{g(x) - g(N)}{x - N} + \lim_{x \to N}g(N)\,\lim_{x \to N}\frac{f(x) - f(N)}{x - N} \tag{7.5'}$$

The four limit expressions in (7.5') are easily evaluated. The first one is $f(N)$, and the third is $g(N)$ (limit of a constant). The remaining two are, according to (6.13), respectively, $g'(N)$ and $f'(N)$. Thus (7.5') reduces to

$$\left.\frac{d}{dx}[f(x)g(x)]\right|_{x=N} = f(N)g'(N) + g(N)f'(N) \tag{7.5''}$$

And, since $N$ represents any value of $x$, (7.5'') remains valid if we replace every $N$ symbol by $x$. This proves the rule.

As an extension of the rule to the case of three functions, we have

$$\frac{d}{dx}[f(x)g(x)h(x)] = f'(x)g(x)h(x) + f(x)g'(x)h(x) + f(x)g(x)h'(x) \qquad [\text{cf. } (7.4')] \tag{7.6}$$

In words, the derivative of the product of three functions is equal to the product of the second and third functions times the derivative of the first, plus the product of the first and third functions times the derivative of the second, plus the product of the first and second functions times the derivative of the third. This result can be derived by the repeated application of (7.4). First treat the product $g(x)h(x)$ as a single function, say, $\phi(x)$, so that the original product of three functions will become a product of two functions, $f(x)\phi(x)$. To this, (7.4) is applicable. After the derivative of $f(x)\phi(x)$ is obtained, we may reapply (7.4) to the product $g(x)h(x) \equiv \phi(x)$ to get $\phi'(x)$. Then (7.6) will follow. The details are left to you as an exercise.

The validity of a rule is one thing; its serviceability is something else. Why do we need the product rule when we can resort to the alternative procedure of multiplying out the two functions $f(x)$ and $g(x)$ and then taking the derivative of the product directly? One answer to this question is that the alternative procedure is applicable only to specific (numerical or parametric) functions, whereas the product rule is applicable even when the functions are given in the general form. Let us illustrate with an economic example.


Finding Marginal-Revenue Function from Average-Revenue Function

If we are given an average-revenue (AR) function in specific form,

$$AR = 15 - Q$$

the marginal-revenue (MR) function can be found by first multiplying AR by $Q$ to get the total-revenue ($R$) function:

$$R \equiv AR \cdot Q = (15 - Q)Q = 15Q - Q^2$$

and then differentiating $R$:

$$MR \equiv \frac{dR}{dQ} = 15 - 2Q$$


But if the AR function is given in the general form $AR = f(Q)$, then the total-revenue function will also be in a general form:

$$R \equiv AR \cdot Q = f(Q) \cdot Q$$

and therefore the "multiply out" approach will be to no avail. However, because $R$ is a product of two functions of $Q$, namely, $f(Q)$ and $Q$ itself, the product rule can be put to work. Thus we can differentiate $R$ to get the MR function as follows:

$$MR \equiv \frac{dR}{dQ} = f(Q) \cdot 1 + Q \cdot f'(Q) = f(Q) + Qf'(Q) \tag{7.7}$$

FIGURE 7.2

However, can such a general result tell us anything significant about the MR? Indeed it can. Recalling that $f(Q)$ denotes the AR function, let us rearrange (7.7) and write

$$MR - AR = MR - f(Q) = Qf'(Q) \tag{7.7'}$$

This gives us an important relationship between MR and AR: namely, they will always differ by the amount $Qf'(Q)$.
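For the specific function $AR = 15 - Q$ used earlier, the relationship can be verified directly. The sketch below is illustrative; the function names `AR`, `AR_slope`, and `MR`, and the sample output levels, are assumptions chosen here.

```python
def AR(Q):
    return 15 - Q            # average revenue f(Q), the specific form in the text

def AR_slope(Q):
    return -1                # f'(Q), the constant slope of the linear AR curve

def MR(Q):
    # (7.7): MR = f(Q) + Q f'(Q)
    return AR(Q) + Q * AR_slope(Q)

# Gap between the two curves at a few output levels; by (7.7') it is Q f'(Q).
gaps = {Q: MR(Q) - AR(Q) for Q in (0, 2, 5)}
```

The function `MR` reproduces $15 - 2Q$, and the gap at each output level equals $-Q$, so the MR curve falls below the AR curve by an amount that grows with output.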

It remains to examine the expression $Qf'(Q)$. Its first component $Q$ denotes output and is always nonnegative. The other component, $f'(Q)$, represents the slope of the AR curve plotted against $Q$. Since "average revenue" and "price" are but different names for the same thing:

$$AR \equiv \frac{R}{Q} = \frac{PQ}{Q} = P$$

the AR curve can also be regarded as a curve relating price $P$ to output $Q$: $P = f(Q)$. Viewed in this light, the AR curve is simply the inverse of the demand curve for the product of the firm, i.e., the demand curve plotted after the $P$ and $Q$ axes are reversed. Under pure competition, the AR curve is a horizontal straight line, so that $f'(Q) = 0$ and, from (7.7'), $MR - AR = 0$ for all possible values of $Q$. Thus the MR curve and the AR curve must coincide. Under imperfect competition, on the other hand, the AR curve is normally downward-sloping, as in Fig. 7.2, so that $f'(Q) < 0$ and, from (7.7'), $MR - AR < 0$ for all positive levels of output. In this case, the MR curve must lie below the AR curve.

The conclusion just stated is qualitative in nature; it concerns only the relative positions of the two curves. But (7.7') also furnishes the quantitative information that the MR curve will fall short of the AR curve at any output level $Q$ by precisely the amount $Qf'(Q)$. Let us look at Fig. 7.2 again and consider the particular output level $N$. For that output, the expression $Qf'(Q)$ specifically becomes $Nf'(N)$; if we can find the magnitude of $Nf'(N)$ in the diagram, we shall know how far below the average-revenue point $G$ the corresponding marginal-revenue point must lie.

The magnitude of $N$ is already specified. And $f'(N)$ is simply the slope of the AR curve at point $G$ (where $Q = N$), that is, the slope of the tangent line $JM$ measured by the ratio of two distances $OJ/OM$. However, we see that $OJ/OM = HJ/HG$; besides, distance $HG$ is precisely the amount of output under consideration, $N$. Thus the distance $Nf'(N)$, by which the MR curve must lie below the AR curve at output $N$, is

$$Nf'(N) = HG\,\frac{HJ}{HG} = HJ$$

Accordingly, if we mark a vertical distance $KG = HJ$ directly below point $G$, then point $K$ must be a point on the MR curve. (A simple way of accurately plotting $KG$ is to draw a straight line passing through point $H$ and parallel to $JG$; point $K$ is where that line intersects the vertical line $NG$.)

The same procedure can be used to locate other points on the MR curve. All we must do, for any chosen point $G'$ on the curve, is first to draw a tangent to the AR curve at $G'$ that will meet the vertical axis at some point $J'$. Then draw a horizontal line from $G'$ to the vertical axis, and label the intersection with the axis as $H'$. If we mark a vertical distance $K'G' = H'J'$ directly below point $G'$, then the point $K'$ will be a point on the MR curve. This is the graphical way of deriving an MR curve from a given AR curve. Strictly speaking, the accurate drawing of a tangent line requires a knowledge of the value of the derivative at the relevant output, that is, $f'(N)$; hence the graphical method just outlined cannot quite exist by itself. An important exception is the case of a linear AR curve, where the tangent to any point on the curve is simply the given line itself, so that there is in effect no need to draw any tangent at all. Then the graphical method will apply in a straightforward way.

Quotient Rule 

The derivative of the quotient of two functions, $f(x)/g(x)$, is

$$\frac{d}{dx}\frac{f(x)}{g(x)} = \frac{f'(x)g(x) - f(x)g'(x)}{g^2(x)}$$

In the numerator of the right-hand expression, we find two product terms, each involving the derivative of only one of the two original functions. Note that $f'(x)$ appears in the positive term, and $g'(x)$ in the negative term. The denominator consists of the square of the function $g(x)$; that is, $g^2(x) \equiv [g(x)]^2$.

Example 6

$$\frac{d}{dx}\left(\frac{2x - 3}{x + 1}\right) = \frac{2(x + 1) - (2x - 3)(1)}{(x + 1)^2} = \frac{5}{(x + 1)^2}$$

Example 7

$$\frac{d}{dx}\left(\frac{5x}{x^2 + 1}\right) = \frac{5(x^2 + 1) - 5x(2x)}{(x^2 + 1)^2} = \frac{5(1 - x^2)}{(x^2 + 1)^2}$$

Example 8

$$\frac{d}{dx}\left(\frac{ax^2 + b}{cx}\right) = \frac{2ax(cx) - (ax^2 + b)(c)}{(cx)^2} = \frac{c(ax^2 - b)}{(cx)^2} = \frac{ax^2 - b}{cx^2}$$

This rule can be proved as follows. For any value of $x = N$, we have

$$\left.\frac{d}{dx}\frac{f(x)}{g(x)}\right|_{x=N} = \lim_{x \to N}\frac{f(x)/g(x) - f(N)/g(N)}{x - N} \tag{7.8}$$


The quotient expression following the limit sign can be rewritten in the form

$$\frac{f(x)g(N) - f(N)g(x)}{g(x)g(N)}\,\frac{1}{x - N}$$

By adding and subtracting $f(N)g(N)$ in the numerator and rearranging, we can further transform the expression to

$$\frac{1}{g(x)g(N)}\left[\frac{f(x)g(N) - f(N)g(N) + f(N)g(N) - f(N)g(x)}{x - N}\right] = \frac{1}{g(x)g(N)}\left[g(N)\frac{f(x) - f(N)}{x - N} - f(N)\frac{g(x) - g(N)}{x - N}\right]$$

Substituting this result into (7.8) and taking the limit, we then have

$$\begin{aligned}
\left.\frac{d}{dx}\frac{f(x)}{g(x)}\right|_{x=N} &= \lim_{x \to N}\frac{1}{g(x)g(N)}\left[\lim_{x \to N}g(N)\,\lim_{x \to N}\frac{f(x) - f(N)}{x - N} - \lim_{x \to N}f(N)\,\lim_{x \to N}\frac{g(x) - g(N)}{x - N}\right] \\
&= \frac{1}{g^2(N)}\left[g(N)f'(N) - f(N)g'(N)\right] && \text{[by (6.13)]}
\end{aligned}$$

which can be generalized by replacing the symbol $N$ with $x$, because $N$ represents any value of $x$. This proves the quotient rule.
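As a numerical cross-check of the quotient rule, we can compare three routes to the derivative of $(2x - 3)/(x + 1)$, the quotient of Example 6. The sketch below is illustrative only: the helper names, the evaluation point `x0 = 1.5`, and the step size are assumptions chosen here.

```python
def quotient_rule(f, fp, g, gp, x):
    """[f'(x)g(x) - f(x)g'(x)] / g(x)^2; requires g(x) != 0."""
    return (fp(x) * g(x) - f(x) * gp(x)) / g(x) ** 2

def diff_quotient(func, x, h=1e-6):
    """Central difference quotient, used as an independent numerical check."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Numerator and denominator of Example 6, with their derivatives.
f, fp = (lambda x: 2 * x - 3), (lambda x: 2)
g, gp = (lambda x: x + 1), (lambda x: 1)

x0 = 1.5  # arbitrary evaluation point with g(x0) != 0
by_rule = quotient_rule(f, fp, g, gp, x0)
closed_form = 5 / (x0 + 1) ** 2               # result stated in Example 6
numeric = diff_quotient(lambda x: f(x) / g(x), x0)
```

At this point all three values are approximately 0.8, so the rule, the simplified formula of Example 6, and the difference quotient agree.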


Relationship Between Marginal-Cost and Average-Cost Functions

As an economic application of the quotient rule, let us consider the rate of change of average cost when output varies.

Given a total-cost function $C = C(Q)$, the average-cost (AC) function is a quotient of two functions of $Q$, since $AC = C(Q)/Q$, defined as long as $Q > 0$. Therefore, the rate of change of AC with respect to $Q$ can be found by differentiating AC:

$$\frac{d}{dQ}\frac{C(Q)}{Q} = \frac{C'(Q) \cdot Q - C(Q) \cdot 1}{Q^2} = \frac{1}{Q}\left[C'(Q) - \frac{C(Q)}{Q}\right] \tag{7.9}$$

From this it follows that, for $Q > 0$,

$$\frac{d}{dQ}\frac{C(Q)}{Q} \gtreqless 0 \qquad \text{iff} \qquad C'(Q) \gtreqless \frac{C(Q)}{Q} \tag{7.10}$$


Since the derivative $C'(Q)$ represents the marginal-cost (MC) function, and $C(Q)/Q$ represents the AC function, the economic meaning of (7.10) is: The slope of the AC curve will be positive, zero, or negative if and only if the marginal-cost curve lies above, intersects, or lies below the AC curve. This is illustrated in Fig. 7.3, where the MC and AC functions plotted are based on the specific total-cost function

$$C = Q^3 - 12Q^2 + 60Q$$

To the left of $Q = 6$, AC is declining, and thus MC lies below it; to the right, the opposite is true. At $Q = 6$, AC has a slope of zero, and MC and AC have the same value.*

The qualitative conclusion in (7.10) is stated explicitly in terms of cost functions. However, its validity remains unaffected if we interpret $C(Q)$ as any other differentiable total function, with $C(Q)/Q$ and $C'(Q)$ as its corresponding average and marginal functions. Thus this result gives us a general marginal-average relationship. In particular, we may point out, the fact that MR lies below AR when AR is downward-sloping, as discussed in connection with Fig. 7.2, is nothing but a special case of the general result in (7.10).

* Note that (7.10) does not state that, when AC is negatively sloped, MC must also be negatively 
sloped; it merely says that AC must exceed MC in that circumstance. At Q = 5 in Fig. 7.3, for 
instance, AC is declining but MC is rising, so that their slopes will have opposite signs. 
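The marginal-average relationship in (7.10) can be tabulated for the total-cost function behind Fig. 7.3. The sketch below is an illustration; the function names, the helper `sign`, and the three sample output levels are assumptions chosen here.

```python
def C(Q):
    return Q ** 3 - 12 * Q ** 2 + 60 * Q     # total cost used for Fig. 7.3

def MC(Q):
    return 3 * Q ** 2 - 24 * Q + 60          # C'(Q), by the polynomial rules

def AC(Q):
    return C(Q) / Q                          # average cost, defined for Q > 0

def AC_slope(Q):
    # (7.9): d(AC)/dQ = (1/Q) [C'(Q) - C(Q)/Q]
    return (MC(Q) - AC(Q)) / Q

def sign(v):
    return (v > 0) - (v < 0)                 # -1, 0, or +1

# For each output level: (sign of the AC slope, sign of MC - AC).
table = {Q: (sign(AC_slope(Q)), sign(MC(Q) - AC(Q))) for Q in (5, 6, 7)}
```

The two signs coincide at every output level sampled: both are negative at $Q = 5$, zero at $Q = 6$ (where MC and AC both equal 24), and positive at $Q = 7$.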


EXERCISE 7.2 

1. Given the total-cost function $C = Q^3 - 5Q^2 + 12Q + 75$, write out a variable-cost (VC) function. Find the derivative of the VC function, and interpret the economic meaning of that derivative.

2. Given the average-cost function $AC = Q^2 - 4Q + 174$, find the MC function. Is the given function more appropriate as a long-run or a short-run function? Why?

3. Differentiate the following by using the product rule:

(a) $(9x^2 - 2)(3x - 1)$ (c) $x^2(4x + 6)$ (e) $(2 - 3x)(1 + x)(x + 2)$

(b) $(3x + 10)(6x^2 - 7x)$ (d) $(ax - b)(cx^2)$ (f) $(x^2 - 3)x^{-1}$

4. (a) Given $AR = 60 - 3Q$, plot the average-revenue curve, and then find the MR curve by the method used in Fig. 7.2.

(b) Find the total-revenue function and the marginal-revenue function mathematically from the given AR function.

(c) Does the graphically derived MR curve in (a) check with the mathematically derived MR function in (b)?

(d) Comparing the AR and MR functions, what can you conclude about their relative slopes?

5. Provide a mathematical proof for the general result that, given a linear average curve, the corresponding marginal curve must have the same vertical intercept but will be twice as steep as the average curve.

6. Prove the result in (7.6) by first treating $g(x)h(x)$ as a single function, $g(x)h(x) \equiv \phi(x)$, and then applying the product rule (7.4).

7. Find the derivatives of:

(a) $(x^2 + 3)/x$ (c) $6x/(x + 5)$

(b) $(x + 9)/x$ (d) $(ax^2 + b)/(cx + d)$

8. Given the function $f(x) = ax + b$, find the derivatives of:

(a) $f(x)$ (b) $xf(x)$ (c) $1/f(x)$ (d) $f(x)/x$

9. (a) Is it true that $f \in C' \Rightarrow f \in C$?

(b) Is it true that $f \in C \Rightarrow f \in C'$?

10. Find the marginal and average functions for the following total functions and graph the results.

Total-cost function:

(a) $C = 3Q^2 + 7Q + 12$

Total-revenue function:

(b) $R = 10Q - Q^2$

Total-product function:

(c) $Q = aL + bL^2 - cL^3$ $(a, b, c > 0)$


7.3 Rules of Differentiation Involving Functions of Different Variables

In Sec. 7.2, we discussed the rules of differentiation of a sum, difference, product, or quotient of two (or more) differentiable functions of the same variable. Now we shall consider cases where there are two or more differentiable functions, each of which has a distinct independent variable.

Chain Rule 

If we have a differentiable function $z = f(y)$, where $y$ is in turn a differentiable function of another variable $x$, say, $y = g(x)$, then the derivative of $z$ with respect to $x$ is equal to the derivative of $z$ with respect to $y$, times the derivative of $y$ with respect to $x$. Expressed symbolically,

$$\frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx} = f'(y)g'(x) \tag{7.11}$$


This rule, known as the chain rule, appeals easily to intuition. Given a $\Delta x$, there must result a corresponding $\Delta y$ via the function $y = g(x)$, but this $\Delta y$ will in turn bring about a $\Delta z$ via the function $z = f(y)$. Thus there is a "chain reaction" as follows:

$$\Delta x \xrightarrow{\text{via } g} \Delta y \xrightarrow{\text{via } f} \Delta z$$

The two links in this chain entail two difference quotients, $\Delta y/\Delta x$ and $\Delta z/\Delta y$, but when they are multiplied, the $\Delta y$ will cancel itself out, and we end up with

$$\frac{\Delta z}{\Delta y}\frac{\Delta y}{\Delta x} = \frac{\Delta z}{\Delta x}$$

a difference quotient that relates $\Delta z$ to $\Delta x$. If we take the limit of these difference quotients as $\Delta x \to 0$ (which implies $\Delta y \to 0$), each difference quotient will turn into a derivative; i.e., we shall have $(dz/dy)(dy/dx) = dz/dx$. This is precisely the result in (7.11).

In view of the function $y = g(x)$, we can express the function $z = f(y)$ as $z = f[g(x)]$, where the contiguous appearance of the two function symbols $f$ and $g$ indicates that this is a composite function (function of a function). It is for this reason that the chain rule is also referred to as the composite-function rule or function-of-a-function rule.

The extension of the chain rule to three or more functions is straightforward. If we have $z = f(y)$, $y = g(x)$, and $x = h(w)$, then

$$\frac{dz}{dw} = \frac{dz}{dy}\frac{dy}{dx}\frac{dx}{dw} = f'(y)g'(x)h'(w)$$

and similarly for cases in which more functions are involved.
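The chain rule in (7.11) lends itself to a quick numerical check. The sketch below is illustrative: the pair $z = 3y^2$, $y = 2x + 5$ is used as a convenient composite, and the helper `diff_quotient`, its step size, and the evaluation point are assumptions chosen here.

```python
def g(x):
    return 2 * x + 5         # inner function y = g(x)

def f(y):
    return 3 * y ** 2        # outer function z = f(y)

def g_prime(x):
    return 2                 # dy/dx

def f_prime(y):
    return 6 * y             # dz/dy

def chain(x):
    # (7.11): dz/dx = f'(y) g'(x), with y evaluated at g(x)
    return f_prime(g(x)) * g_prime(x)

def diff_quotient(func, x, h=1e-6):
    """Central difference quotient applied to the composite function."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 1.0
by_chain = chain(x0)                                # 6 * 7 * 2
numeric = diff_quotient(lambda x: f(g(x)), x0)      # slope of z = f[g(x)] at x0
```

At `x0 = 1` the chain rule gives 84, which equals $12(2x + 5)$ at that point, and the difference quotient of the composite function agrees to within rounding error.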


Example 1

If $z = 3y^2$, where $y = 2x + 5$, then

$$\frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx} = 6y(2) = 12y = 12(2x + 5)$$

Example 2

If $z = y - 3$, where $y = x^3$, then

$$\frac{dz}{dx} = 1(3x^2) = 3x^2$$

Example 3

The usefulness of this rule can best be appreciated when we must differentiate a function such as $z = (x^2 + 3x - 2)^{17}$. Without the chain rule at our disposal, $dz/dx$ can be found only via the laborious route of first multiplying out the 17th-power expression. With the chain rule, however, we can take a shortcut by defining a new, intermediate variable $y = x^2 + 3x - 2$, so that we get in effect two functions linked in a chain:

$$z = y^{17} \qquad \text{and} \qquad y = x^2 + 3x - 2$$

The derivative $dz/dx$ can then be found as follows:

$$\frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx} = 17y^{16}(2x + 3) = 17(x^2 + 3x - 2)^{16}(2x + 3)$$

Example 4

Given a total-revenue function of a firm $R = f(Q)$, where output $Q$ is a function of labor input $L$, or $Q = g(L)$, find $dR/dL$. By the chain rule, we have

$$\frac{dR}{dL} = \frac{dR}{dQ}\frac{dQ}{dL} = f'(Q)g'(L)$$

Translated into economic terms, $dR/dQ$ is the MR function and $dQ/dL$ is the marginal-physical-product-of-labor ($MPP_L$) function. Similarly, $dR/dL$ has the connotation of the marginal-revenue-product-of-labor ($MRP_L$) function. Thus the result shown constitutes the mathematical statement of the well-known result in economics that $MRP_L = MR \cdot MPP_L$.

Inverse-Function Rule

If the function $y = f(x)$ represents a one-to-one mapping, i.e., if the function is such that each value of $y$ is associated with a unique value of $x$, the function $f$ will have an inverse function $x = f^{-1}(y)$ (read: "$x$ is an inverse function of $y$"). Here, the symbol $f^{-1}$ is a function symbol which, like the derivative-function symbol $f'$, signifies a function related to the function $f$; it does not mean the reciprocal of the function $f(x)$.

What the existence of an inverse function essentially means is that, in this case, not only will a given value of $x$ yield a unique value of $y$ [that is, $y = f(x)$], but also a given value of $y$ will yield a unique value of $x$. To take a nonnumerical instance, we may exemplify the one-to-one mapping by the mapping from the set of all husbands to the set of all wives in a monogamous society. Each husband has a unique wife, and each wife has a unique husband. In contrast, the mapping from the set of all fathers to the set of all sons is not one-to-one, because a father may have more than one son, albeit each son has a unique father.

When $x$ and $y$ refer specifically to numbers, the property of one-to-one mapping is seen to be unique to the class of functions known as strictly monotonic (or monotone) functions. Given a function $f(x)$, if successively larger values of the independent variable $x$ always lead to successively larger values of $f(x)$, that is, if

$$x_1 > x_2 \Rightarrow f(x_1) > f(x_2)$$

then the function $f$ is said to be a strictly increasing function. If successive increases in $x$ always lead to successive decreases in $f(x)$, that is, if

$$x_1 > x_2 \Rightarrow f(x_1) < f(x_2)$$

on the other hand, the function is said to be a strictly decreasing function. In either of these cases, an inverse function $f^{-1}$ exists.†

A practical way of ascertaining the strict monotonicity of a given function $y = f(x)$ is to check whether the derivative $f'(x)$ always adheres to the same algebraic sign (not zero) for all values of $x$. Geometrically, this means that its slope is either always upward or always downward. Thus a firm's demand curve $Q = f(P)$ that has a negative slope throughout is strictly decreasing. As such, it has an inverse function $P = f^{-1}(Q)$, which, as mentioned previously, gives the average-revenue curve of the firm, since $P = AR$.

† By omitting the adverb strictly, we can define monotonic (or monotone) functions as follows: An increasing function is a function with the property that

$$x_1 > x_2 \Rightarrow f(x_1) \ge f(x_2) \qquad \text{[with the weak inequality $\ge$]}$$

and a decreasing function is one with the property that

$$x_1 > x_2 \Rightarrow f(x_1) \le f(x_2) \qquad \text{[with the weak inequality $\le$]}$$

Note that, under this definition, an ascending (descending) step function qualifies as an increasing (decreasing) function, despite the fact that its graph contains horizontal segments. Since such functions do not have a one-to-one mapping, they do not have inverse functions.


Example 5

The function

$$y = 5x + 25$$

has the derivative $dy/dx = 5$, which is positive regardless of the value of $x$; thus the function is strictly increasing. It follows that an inverse function exists. In the present case, the inverse function is easily found by solving the given equation $y = 5x + 25$ for $x$. The result is the function

$$x = \tfrac{1}{5}y - 5$$

It is interesting to note that this inverse function is also strictly increasing, because $dx/dy = \tfrac{1}{5} > 0$ for all values of $y$.

Generally speaking, if an inverse function exists, the original and the inverse functions must both be strictly monotonic. Moreover, if $f^{-1}$ is the inverse function of $f$, then $f$ must be the inverse function of $f^{-1}$; that is, $f$ and $f^{-1}$ must be inverse functions of each other.

It is easy to verify that the graph of $y = f(x)$ and that of $x = f^{-1}(y)$ are one and the same, only with the axes reversed. If one lays the $x$ axis of the $f^{-1}$ graph over the $x$ axis of the $f$ graph (and similarly for the $y$ axis), the two curves will coincide. On the other hand, if the $x$ axis of the $f^{-1}$ graph is laid over the $y$ axis of the $f$ graph (and vice versa), the two curves will become mirror images of each other with reference to the 45° line drawn through the origin. This mirror-image relationship provides us with an easy way of graphing the inverse function $f^{-1}$, once the graph of the original function $f$ is given. (You should try this with the two functions in Example 5.)

For inverse functions, the rule of differentiation is

$$\frac{dx}{dy} = \frac{1}{dy/dx}$$

This means that the derivative of the inverse function is the reciprocal of the derivative of
the original function; as such, $dx/dy$ must take the same sign as $dy/dx$, so that if $f$ is strictly
increasing (decreasing), then so must be $f^{-1}$.

As a verification of this rule, we can refer back to Example 5, where $dy/dx$ was found to
be 5, and $dx/dy$ equal to $\frac{1}{5}$. These two derivatives are indeed reciprocal to each other and
have the same sign.
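The verification for Example 5 can also be carried out mechanically. The following is a minimal sketch in Python using exact fractions; the function names `f` and `f_inv` are introduced here only for illustration.

```python
from fractions import Fraction

def f(x):
    """Original function of Example 5: y = 5x + 25."""
    return 5 * x + 25

def f_inv(y):
    """Inverse function, obtained by solving y = 5x + 25 for x."""
    return Fraction(1, 5) * y - 5

dy_dx = Fraction(5)       # slope of f
dx_dy = Fraction(1, 5)    # slope of f_inv

# Round trip: applying f and then f_inv returns the starting value of x.
round_trip_ok = all(f_inv(f(Fraction(x))) == x for x in range(-3, 4))

# Inverse-function rule: dx/dy is the reciprocal of dy/dx.
rule_ok = dx_dy == 1 / dy_dx

print(round_trip_ok, rule_ok)
```

Both checks should come out true, and both slopes are positive, in line with the statement that the two functions are strictly increasing.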

In that simple example, the inverse function is relatively easy to obtain, so that its 
derivative dx/dy can be found directly from the inverse function. As Example 6 shows, 
however, the inverse function is sometimes difficult to express explicitly, and thus direct 
differentiation may not be practicable. The usefulness of the inverse-function rule then 
becomes more fully apparent. 

Example 6 Given $y = x^5 + x$, find $dx/dy$. First of all, since

$$\frac{dy}{dx} = 5x^4 + 1 > 0$$

for any value of $x$, the given function is strictly increasing, and an inverse function exists. To
solve the given equation for $x$ may not be such an easy task, but the derivative of the inverse
function can nevertheless be found quickly by use of the inverse-function rule:

$$\frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{5x^4 + 1}$$
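Although the inverse of $y = x^5 + x$ has no convenient explicit formula, it can be computed numerically, which gives a way to check the rule. The sketch below assumes a bisection search on the interval $[-10, 10]$ (an arbitrary choice made here) and compares a difference quotient of the numerical inverse with $1/(5x^4 + 1)$ at $x = 1$.

```python
def f(x):
    """Function of Example 6: y = x^5 + x (strictly increasing)."""
    return x**5 + x

def f_inverse(y, lo=-10.0, hi=10.0, tol=1e-12):
    """Solve x^5 + x = y for x by bisection; valid because f is strictly increasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def dx_dy_rule(x):
    """Inverse-function rule: dx/dy = 1 / (5x^4 + 1)."""
    return 1 / (5 * x**4 + 1)

x0 = 1.0
y0 = f(x0)  # the point on the curve where x = 1
h = 1e-6
# Central difference quotient of the numerically inverted function
numeric = (f_inverse(y0 + h) - f_inverse(y0 - h)) / (2 * h)
print(numeric, dx_dy_rule(x0))  # both should be close to 1/6
```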

The inverse-function rule is, strictly speaking, applicable only when the function involved 
is a one-to-one mapping. In fact, however, we do have some leeway. For instance, when 
dealing with a U-shaped curve (not strictly monotonic), we may consider the downward-
and the upward-sloping segments of the curve as representing two separate functions, each 
with a restricted domain, and each being strictly monotonic in the restricted domain. To 
each of these, the inverse-function rule can then again be applied. 


EXERCISE 7.3 

1. Given $y = u^3 + 2u$, where $u = 5 - x^2$, find $dy/dx$ by the chain rule.

2. Given $w = ay^2$ and $y = bx^2 + cx$, find $dw/dx$ by the chain rule.

3. Use the chain rule to find $dy/dx$ for the following:

(a) $y = (3x^2 - 13)^3$ (b) $y = (7x^3 - 5)^9$ (c) $y = (ax + b)^5$

4. Given $y = (16x + 3)^{-2}$, use the chain rule to find $dy/dx$. Then rewrite the function as
$y = 1/(16x + 3)^2$ and find $dy/dx$ by the quotient rule. Are the answers identical?

5. Given $y = 7x + 21$, find its inverse function. Then find $dy/dx$ and $dx/dy$, and verify the
inverse-function rule. Also verify that the graphs of the two functions bear a mirror-
image relationship to each other.

6. Are the following functions strictly monotonic?

(a) $y = -x^6 + 5$ $(x > 0)$

(b) $y = 4x^5 + x^3 + 3x$

For each strictly monotonic function, find $dx/dy$ by the inverse-function rule.

7.4 Partial Differentiation 


Hitherto, we have considered only the derivatives of functions of a single independent
variable. In comparative-static analysis, however, we are likely to encounter the situation in
which several parameters appear in a model, so that the equilibrium value of each endogenous
variable may be a function of more than one parameter. Therefore, as a final preparation
for the application of the concept of derivative to comparative statics, we must learn
how to find the derivative of a function of more than one variable.

Partial Derivatives 

Let us consider a function

$$y = f(x_1, x_2, \ldots, x_n) \qquad (7.12)$$

where the variables $x_i$ ($i = 1, 2, \ldots, n$) are all independent of one another, so that each can
vary by itself without affecting the others. If the variable $x_1$ undergoes a change $\Delta x_1$ while
$x_2, \ldots, x_n$ all remain fixed, there will be a corresponding change in $y$, namely, $\Delta y$. The
difference quotient in this case can be expressed as

$$\frac{\Delta y}{\Delta x_1} = \frac{f(x_1 + \Delta x_1, x_2, \ldots, x_n) - f(x_1, x_2, \ldots, x_n)}{\Delta x_1} \qquad (7.13)$$

If we take the limit of $\Delta y/\Delta x_1$ as $\Delta x_1 \to 0$, that limit will constitute a derivative. We call
it the partial derivative of $y$ with respect to $x_1$, to indicate that all the other independent
variables in the function are held constant when taking this particular derivative. Similar
partial derivatives can be defined for infinitesimal changes in the other independent
variables. The process of taking partial derivatives is called partial differentiation.

Partial derivatives are assigned distinctive symbols. In lieu of the letter $d$ (as in $dy/dx$),
we employ the symbol $\partial$, which is a variant of the Greek $\delta$ (lowercase delta). Thus we
shall now write $\partial y/\partial x_i$, which is read: "the partial derivative of $y$ with respect to $x_i$." The
partial-derivative symbol sometimes is also written as $\frac{\partial}{\partial x_i}\,y$; in that case, its $\partial/\partial x_i$ part can
be regarded as an operator symbol instructing us to take the partial derivative of (some
function) with respect to the variable $x_i$. Since the function involved here is denoted in
(7.12) by $f$, it is also permissible to write $\partial f/\partial x_i$.

Is there also a partial-derivative counterpart for the symbol $f'(x)$ that we used before?
The answer is yes. Instead of $f'$, however, we now use $f_1$, $f_2$, etc., where the subscript
indicates which independent variable (alone) is being allowed to vary. If the function in (7.12)
happens to be written in terms of unsubscripted variables, such as $y = f(u, v, w)$, then the
partial derivatives may be denoted by $f_u$, $f_v$, and $f_w$ rather than $f_1$, $f_2$, and $f_3$.

In line with these notations, and on the basis of (7.12) and (7.13), we can now define

$$f_1 \equiv \frac{\partial y}{\partial x_1} \equiv \lim_{\Delta x_1 \to 0} \frac{\Delta y}{\Delta x_1}$$

as the first in the set of $n$ partial derivatives of the function $f$.


Techniques of Partial Differentiation 

Partial differentiation differs from the previously discussed differentiation primarily in that
we must hold $(n - 1)$ independent variables constant while allowing one variable to vary.
Inasmuch as we have learned how to handle constants in differentiation, the actual
differentiation should pose little problem.

Example 1 Given $y = f(x_1, x_2) = 3x_1^2 + x_1 x_2 + 4x_2^2$, find the partial derivatives. When finding
$\partial y/\partial x_1$ (or $f_1$), we must bear in mind that $x_2$ is to be treated as a constant during differentiation.
As such, $x_2$ will drop out in the process if it is an additive constant (such as the term $4x_2^2$) but
will be retained if it is a multiplicative constant (such as in the term $x_1 x_2$). Thus we have

$$\frac{\partial y}{\partial x_1} \equiv f_1 = 6x_1 + x_2$$

Similarly, by treating $x_1$ as a constant, we find that

$$\frac{\partial y}{\partial x_2} \equiv f_2 = x_1 + 8x_2$$

Note that, like the primitive function $f$, both partial derivatives are themselves functions
of the variables $x_1$ and $x_2$. That is, we may write them as two derived functions

$$f_1 = f_1(x_1, x_2) \quad \text{and} \quad f_2 = f_2(x_1, x_2)$$

For the point $(x_1, x_2) = (1, 3)$ in the domain of the function $f$, for example, the partial
derivatives will take the following specific values:

$$f_1(1, 3) = 6(1) + 3 = 9 \quad \text{and} \quad f_2(1, 3) = 1 + 8(3) = 25$$
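The link between the difference quotient and these partial derivatives can be illustrated numerically. The sketch below evaluates the difference quotient at the point $(1, 3)$ for shrinking increments; the helper names are introduced here only for illustration.

```python
def f(x1, x2):
    """Function of Example 1: y = 3*x1^2 + x1*x2 + 4*x2^2."""
    return 3 * x1**2 + x1 * x2 + 4 * x2**2

def difference_quotient_x1(x1, x2, dx):
    """Delta y / Delta x1 with x2 held fixed."""
    return (f(x1 + dx, x2) - f(x1, x2)) / dx

def difference_quotient_x2(x1, x2, dx):
    """Delta y / Delta x2 with x1 held fixed."""
    return (f(x1, x2 + dx) - f(x1, x2)) / dx

# As the increment shrinks, the two quotients approach 9 and 25.
for dx in (0.1, 0.001, 0.00001):
    print(dx, difference_quotient_x1(1, 3, dx), difference_quotient_x2(1, 3, dx))
```

For this quadratic function the quotients work out to $9 + 3\,\Delta x_1$ and $25 + 4\,\Delta x_2$, so the approach to the limits 9 and 25 is visible directly in the printed values.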


Example 2 Given $y = f(u, v) = (u + 4)(3u + 2v)$, the partial derivatives can be found by use of the
product rule. By holding $v$ constant, we have

$$f_u = (u + 4)(3) + 1(3u + 2v) = 2(3u + v + 6)$$

Similarly, by holding $u$ constant, we find that

$$f_v = (u + 4)(2) + 0(3u + 2v) = 2(u + 4)$$

When $u = 2$ and $v = 1$, these derivatives will take the following values:

$$f_u(2, 1) = 2(13) = 26 \quad \text{and} \quad f_v(2, 1) = 2(6) = 12$$
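As a cross-check on Example 2 (a worked alternative, not part of the original example), one may multiply out the product first and then differentiate term by term:

```latex
\begin{aligned}
y   &= (u + 4)(3u + 2v) = 3u^2 + 2uv + 12u + 8v \\
f_u &= 6u + 2v + 12 = 2(3u + v + 6) \\
f_v &= 2u + 8 = 2(u + 4)
\end{aligned}
```

Both results agree with those obtained by the product rule.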


Example 3 Given $y = (3u - 2v)/(u^2 + 3v)$, the partial derivatives can be found by use of the quotient
rule:

$$\frac{\partial y}{\partial u} = \frac{3(u^2 + 3v) - 2u(3u - 2v)}{(u^2 + 3v)^2} = \frac{-3u^2 + 4uv + 9v}{(u^2 + 3v)^2}$$

$$\frac{\partial y}{\partial v} = \frac{-2(u^2 + 3v) - 3(3u - 2v)}{(u^2 + 3v)^2} = \frac{-u(2u + 9)}{(u^2 + 3v)^2}$$


Geometric Interpretation of Partial Derivatives 

As a special type of derivative, a partial derivative is a measure of the instantaneous rates
of change of some variable, and in that capacity it again has a geometric counterpart in the
slope of a particular curve.

Let us consider a production function $Q = Q(K, L)$, where $Q$, $K$, and $L$ denote output,
capital input, and labor input, respectively. This function is a particular two-variable
version of (7.12), with $n = 2$. We can therefore define two partial derivatives $\partial Q/\partial K$ (or $Q_K$)
and $\partial Q/\partial L$ (or $Q_L$). The partial derivative $Q_K$ relates to the rates of change of output with
respect to infinitesimal changes in capital, while labor input is held constant. Thus $Q_K$
symbolizes the marginal-physical-product-of-capital ($MPP_K$) function. Similarly, the
partial derivative $Q_L$ is the mathematical representation of the $MPP_L$ function.

Geometrically, the production function $Q = Q(K, L)$ can be depicted by a production
surface in a 3-space, such as is shown in Fig. 7.4. The variable $Q$ is plotted vertically, so
that for any point $(K, L)$ in the base plane ($KL$ plane), the height of the surface will
indicate the output $Q$. The domain of the function should consist of the entire nonnegative
quadrant of the base plane, but for our purposes it is sufficient to consider a subset of it, the
rectangle $0K_0BL_0$. As a consequence, only a small portion of the production surface is
shown in the figure.

FIGURE 7.4

Let us now hold capital fixed at the level $K_0$ and consider only variations in the input $L$.
By setting $K = K_0$, all points in our (curtailed) domain become irrelevant except those on
the line segment $K_0B$. By the same token, only the curve $K_0CDA$ (a cross section of the
production surface) is germane to the present discussion. This curve represents a total-
physical-product-of-labor ($TPP_L$) curve for a fixed amount of capital $K = K_0$; thus we
may read from its slope the rate of change of $Q$ with respect to changes in $L$ while $K$ is held
constant. It is clear, therefore, that the slope of a curve such as $K_0CDA$ represents the
geometric counterpart of the partial derivative $Q_L$. Once again, we note that the slope of a total
($TPP_L$) curve is its corresponding marginal ($MPP_L \equiv Q_L$) curve.

As mentioned earlier, a partial derivative is a function of all the independent variables of
the primitive function. That $Q_L$ is a function of $L$ is immediately obvious from the $K_0CDA$
curve itself. When $L = L_1$, the value of $Q_L$ is equal to the slope of the curve at point $C$; but
when $L = L_2$, the relevant slope is the one at point $A$. Why is $Q_L$ also a function of $K$? The
answer is that $K$ can be fixed at various levels, and for each fixed level of $K$, there results a
different $TPP_L$ curve (a different cross section of the production surface), with inevitable
repercussions on the derivative $Q_L$. Hence $Q_L$ is also a function of $K$.

An analogous interpretation can be given to the partial derivative $Q_K$. If the labor input
is held constant instead of $K$ (say, at the level of $L_0$), the line segment $L_0B$ will be the
relevant subset of the domain, and the curve $L_0A$ will indicate the relevant subset of the
production surface. The partial derivative $Q_K$ can then be interpreted as the slope of the curve
$L_0A$, bearing in mind that the $K$ axis extends from southeast to northwest in Fig. 7.4. It
should be noted that $Q_K$ is again a function of both the variables $L$ and $K$.
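To make the dependence of $Q_L$ on both inputs concrete, the sketch below uses a specific production function, $Q = K^{0.5}L^{0.5}$. This functional form and the input levels are assumptions made here purely for illustration; the text leaves $Q(K, L)$ general.

```python
# Hypothetical production function, chosen only for illustration:
# Q = K^0.5 * L^0.5 (the text itself keeps Q(K, L) general).

def Q(K, L):
    return K**0.5 * L**0.5

def Q_L(K, L):
    """Marginal physical product of labor: partial derivative of Q w.r.t. L."""
    return 0.5 * K**0.5 * L**-0.5

def Q_K(K, L):
    """Marginal physical product of capital: partial derivative of Q w.r.t. K."""
    return 0.5 * K**-0.5 * L**0.5

# Moving along one TPP_L curve (K fixed at 4): the slope changes with L.
print(Q_L(4, 1), Q_L(4, 4))

# Switching to another cross section (K fixed at 9) changes the slope at the same L.
print(Q_L(9, 1))
```

With these assumed numbers, the slope falls as $L$ rises along a given cross section, and it is higher on the cross section with more capital.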


Gradient Vector 

All the partial derivatives of a function $y = f(x_1, x_2, \ldots, x_n)$ can be collected under a
single mathematical entity called the gradient vector, or simply the gradient, of function $f$:

$$\operatorname{grad} f(x_1, x_2, \ldots, x_n) = (f_1, f_2, \ldots, f_n)$$

where $f_i \equiv \partial y/\partial x_i$. Note that we are using parentheses rather than brackets here in writing
the vector. Alternatively, the gradient can be denoted by $\nabla f(x_1, x_2, \ldots, x_n)$, where $\nabla$
(read: "del") is the inverted version of the Greek letter $\Delta$.

Since the function $f$ has $n$ arguments, there are altogether $n$ partial derivatives; hence,
grad $f$ is an $n$-vector. When these derivatives are evaluated at a specific point
$(x_{10}, x_{20}, \ldots, x_{n0})$ in the domain, we get grad $f(x_{10}, x_{20}, \ldots, x_{n0})$, a vector of specific
derivative values.

Example 4 The gradient vector of the production function $Q = Q(K, L)$ is

$$\nabla Q = \nabla Q(K, L) = (Q_K, Q_L)$$
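A gradient can be approximated numerically by differencing one argument at a time while holding the others fixed. The following is a minimal sketch; the helper `gradient` and its step size are choices made here, and the test function is the one from Example 2 of this section.

```python
def gradient(f, point, h=1e-6):
    """Approximate grad f at `point` by central differences, one argument at a time."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h      # raise only the i-th argument
        down[i] -= h    # lower only the i-th argument
        grad.append((f(*up) - f(*down)) / (2 * h))
    return tuple(grad)

def f(u, v):
    # Function from Example 2: y = (u + 4)(3u + 2v)
    return (u + 4) * (3 * u + 2 * v)

print(gradient(f, (2.0, 1.0)))  # close to (26, 12)
```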


EXERCISE 7.4 


1. Find $\partial y/\partial x_1$ and $\partial y/\partial x_2$ for each of the following functions:

(a) $y = 2x_1^3 - 11x_1^2 x_2 + 3x_2^2$

(b) $y = 7x_1 + 6x_1 x_2^2 - 9x_2^3$

(c) $y = (2x_1 + 3)(x_2 - 2)$

(d) $y = (5x_1 + 3)/(x_2 - 2)$

2. Find $f_x$ and $f_y$ from the following:

(a) $f(x, y) = x^2 + 5xy - y^3$

(b) $f(x, y) = (x^2 - 3y)(x - 2)$

(c) $f(x, y) = \dfrac{2x - 3y}{x + y}$

(d) $f(x, y) = \dfrac{x^2 - 1}{xy}$

3. From the answers to Prob. 2, find $f_x(1, 2)$, the value of the partial derivative $f_x$ when
$x = 1$ and $y = 2$, for each function.

4. Given the production function $Q = 96K^{0.3}L^{0.7}$, find the $MPP_K$ and $MPP_L$ functions. Is
$MPP_K$ a function of $K$ alone, or of both $K$ and $L$? What about $MPP_L$?

5. If the utility function of an individual takes the form

$$U = U(x_1, x_2) = (x_1 + 2)^2 (x_2 + 3)^5$$

where $U$ is total utility, and $x_1$ and $x_2$ are the quantities of two commodities consumed:

(a) Find the marginal-utility function of each of the two commodities.

(b) Find the value of the marginal utility of the first commodity when 3 units of each
commodity are consumed.

6. The total money supply $M$ has two components: bank deposits $D$ and cash holdings $C$,
which we assume to bear a constant ratio $C/D = c$, $0 < c < 1$. The high-powered
money $H$ is defined as the sum of cash holdings held by the public and the reserves
held by the banks. Bank reserves are a fraction of bank deposits, determined by the
reserve ratio $r$, $0 < r < 1$.

(a) Express the money supply $M$ as a function of high-powered money $H$.

(b) Would an increase in the reserve ratio $r$ raise or lower the money supply?

(c) How would an increase in the cash-deposit ratio $c$ affect the money supply?

7. Write the gradients of the following functions:

(a) $f(x, y, z) = x^2 + y^3 + z^4$

(b) $f(x, y, z) = xyz$

7.5 Applications to Comparative-Static Analysis

Equipped with the knowledge of the various rules of differentiation, we can at last tackle
the problem posed in comparative-static analysis: namely, how the equilibrium value of an
endogenous variable will change when there is a change in any of the exogenous variables
or parameters.


Market Model 

First let us consider again the simple one-commodity market model of (3.1). That model
can be written in the form of two equations:

$$Q = a - bP \qquad (a, b > 0) \qquad [\text{demand}]$$
$$Q = -c + dP \qquad (c, d > 0) \qquad [\text{supply}]$$

with solutions

$$P^* = \frac{a + c}{b + d} \qquad (7.14)$$

$$Q^* = \frac{ad - bc}{b + d} \qquad (7.15)$$

These solutions will be referred to as being in the reduced form: The two endogenous
variables have been reduced to explicit expressions of the four mutually independent
parameters $a$, $b$, $c$, and $d$.

To find how an infinitesimal change in one of the parameters will affect the value of $P^*$,
one has only to differentiate (7.14) partially with respect to each of the parameters. If the
sign of a partial derivative, say, $\partial P^*/\partial a$, can be determined from the given information
about the parameters, we shall know the direction in which $P^*$ will move when the
parameter $a$ changes; this constitutes a qualitative conclusion. If the magnitude of $\partial P^*/\partial a$ can be
ascertained, it will constitute a quantitative conclusion.

Similarly, we can draw qualitative or quantitative conclusions from the partial
derivatives of $Q^*$ with respect to each parameter, such as $\partial Q^*/\partial a$. To avoid misunderstanding,
however, a clear distinction should be made between the two derivatives $\partial Q^*/\partial a$ and
$\partial Q/\partial a$. The latter derivative is a concept appropriate to the demand function taken alone,
and without regard to the supply function. The derivative $\partial Q^*/\partial a$ pertains, on the other
hand, to the equilibrium quantity in (7.15) which, being in the nature of a solution of the
model, takes into account the interaction of demand and supply together. To emphasize this
distinction, we shall refer to the partial derivatives of $P^*$ and $Q^*$ with respect to the
parameters as comparative-static derivatives. The possibility of confusion between $\partial Q^*/\partial a$ and
$\partial Q/\partial a$ is precisely the reason why we have chosen to use the asterisk notation, as in $Q^*$, to
denote the equilibrium value.

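Concentrating on $P^*$ for the time being, we can get the following four partial derivatives
from (7.14):

$$\frac{\partial P^*}{\partial a} = \frac{1}{b + d} \qquad \left[\text{parameter } a \text{ has the coefficient } \frac{1}{b + d}\right]$$

$$\frac{\partial P^*}{\partial b} = \frac{0(b + d) - 1(a + c)}{(b + d)^2} = \frac{-(a + c)}{(b + d)^2} \qquad [\text{quotient rule}]$$

$$\frac{\partial P^*}{\partial c} = \frac{1}{b + d} \qquad \left(= \frac{\partial P^*}{\partial a}\right)$$

$$\frac{\partial P^*}{\partial d} = \frac{0(b + d) - 1(a + c)}{(b + d)^2} = \frac{-(a + c)}{(b + d)^2} \qquad \left(= \frac{\partial P^*}{\partial b}\right)$$

Since all the parameters are restricted to being positive in the present model, we can
conclude that

$$\frac{\partial P^*}{\partial a} = \frac{\partial P^*}{\partial c} > 0 \quad \text{and} \quad \frac{\partial P^*}{\partial b} = \frac{\partial P^*}{\partial d} < 0 \qquad (7.16)$$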
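The signs in (7.16) can be checked numerically for any admissible set of parameters. The sketch below uses hypothetical values ($a = 10$, $b = 2$, $c = 2$, $d = 1$, chosen here only for illustration) and compares the analytic derivatives with difference quotients of (7.14).

```python
def P_star(a, b, c, d):
    """Equilibrium price (7.14): P* = (a + c) / (b + d)."""
    return (a + c) / (b + d)

def partials_P_star(a, b, c, d):
    """Analytic comparative-static derivatives of P* with respect to a, b, c, d."""
    dP_da = 1 / (b + d)
    dP_db = -(a + c) / (b + d) ** 2
    return {"a": dP_da, "b": dP_db, "c": dP_da, "d": dP_db}

# Hypothetical parameter values (all positive, as the model requires)
params = {"a": 10.0, "b": 2.0, "c": 2.0, "d": 1.0}
analytic = partials_P_star(**params)

h = 1e-6
for name in params:
    bumped = dict(params)
    bumped[name] += h  # change one parameter, hold the other three fixed
    numeric = (P_star(**bumped) - P_star(**params)) / h
    print(name, round(analytic[name], 4), round(numeric, 4))
```

A numerical check of this kind covers only the chosen parameter values; the generality of (7.16) comes from the analytic derivatives themselves.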


For a fuller appreciation of the results in (7.16), let us look at Fig. 7.5, where each
diagram shows a change in one of the parameters. As before, we are plotting $Q$ (rather than $P$)
on the vertical axis.

Figure 7.5a pictures an increase in the parameter $a$ (to $a'$). This means a higher vertical
intercept for the demand curve, and inasmuch as the parameter $b$ (the slope parameter) is
unchanged, the increase in $a$ results in a parallel upward shift of the demand curve from $D$
to $D'$. The intersection of $D'$ and the supply curve $S$ determines an equilibrium price $P^{*\prime}$,
which is greater than the old equilibrium price $P^*$. This corroborates the result that
$\partial P^*/\partial a > 0$, although for the sake of exposition we have shown in Fig. 7.5a a much larger
change in the parameter $a$ than what the concept of derivative implies.

FIGURE 7.5

The situation in Fig. 7.5c has a similar interpretation; but since the increase takes place
in the parameter $c$, the result is a parallel shift of the supply curve instead. Note that this
shift is downward because the supply curve has a vertical intercept of $-c$; thus an increase
in $c$ would mean a change in the intercept, say, from $-2$ to $-4$. The graphical comparative-
static result, that $P^{*\prime}$ exceeds $P^*$, again conforms to what the positive sign of the derivative
$\partial P^*/\partial c$ would lead us to expect.

Figures 7.5b and 7.5d illustrate the effects of changes in the slope parameters $b$ and $d$
of the two functions in the model. An increase in $b$ means that the slope of the demand
curve will assume a larger numerical (absolute) value; i.e., it will become steeper. In
accordance with the result $\partial P^*/\partial b < 0$, we find a decrease in $P^*$ in this diagram. The
increase in $d$ that makes the supply curve steeper also results in a decrease in the
equilibrium price. This is, of course, again in line with the negative sign of the comparative-static
derivative $\partial P^*/\partial d$.

Thus far, all the results in (7.16) seem to have been obtainable graphically. If so, why
should we bother to use differentiation at all? The answer is that the differentiation
approach has at least two major advantages. First, the graphical technique is subject to a
dimensional restriction, but differentiation is not. Even when the number of endogenous
variables and parameters is such that the equilibrium state cannot be shown graphically, we
can nevertheless apply the differentiation techniques to the problem. Second, the
differentiation method can yield results that are on a higher level of generality. The results in (7.16)
will remain valid, regardless of the specific values that the parameters $a$, $b$, $c$, and $d$ take, as
long as they satisfy the sign restrictions. So the comparative-static conclusions of this
model are, in effect, applicable to an infinite number of combinations of (linear) demand
and supply functions. In contrast, the graphical approach deals only with some specific
members of the family of demand and supply curves, and the analytical result derived
therefrom is applicable, strictly speaking, only to the specific functions depicted.

This discussion serves to illustrate the application of partial differentiation to comparative-
static analysis of the simple market model, but only half of the task has actually been
accomplished, for we can also find the comparative-static derivatives pertaining to $Q^*$. This
we shall leave to you as an exercise.

National-Income Model 

In place of the simple national-income model discussed in Chap. 3, let us now work with a
slightly enlarged model with three endogenous variables, $Y$ (national income), $C$
(consumption), and $T$ (taxes):

$$Y = C + I_0 + G_0$$
$$C = \alpha + \beta(Y - T) \qquad (\alpha > 0;\; 0 < \beta < 1) \qquad (7.17)$$
$$T = \gamma + \delta Y \qquad (\gamma > 0;\; 0 < \delta < 1)$$

The first equation in this system gives the equilibrium condition for national income, while
the second and third equations show, respectively, how $C$ and $T$ are determined in the model.
The restrictions on the values of the parameters $\alpha$, $\beta$, $\gamma$, and $\delta$ can be explained thus: $\alpha$
is positive because consumption is positive even if disposable income $(Y - T)$ is zero; $\beta$ is
a positive fraction because it represents the marginal propensity to consume; $\gamma$ is positive
because even if $Y$ is zero the government will still have a positive tax revenue (from tax
bases other than income); and finally, $\delta$ is a positive fraction because it represents an
income tax rate, and as such it cannot exceed 100 percent. The exogenous variables $I_0$
(investment) and $G_0$ (government expenditure) are, of course, nonnegative. All the
parameters and exogenous variables are assumed to be independent of one another, so that any
one of them can be assigned a new value without affecting the others.

This model can be solved for $Y^*$ by substituting the third equation of (7.17) into the
second and then substituting the resulting equation into the first. The equilibrium income (in
reduced form) is

$$Y^* = \frac{\alpha - \beta\gamma + I_0 + G_0}{1 - \beta + \beta\delta} \qquad (7.18)$$

Similar equilibrium values can also be found for the endogenous variables $C$ and $T$, but we
shall concentrate on the equilibrium income.
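For readers who wish to see the substitution carried out, the steps leading from (7.17) to the reduced form can be sketched as follows (a worked derivation added for convenience, using only the three equations of the model):

```latex
\begin{aligned}
C &= \alpha + \beta\,(Y - \gamma - \delta Y) = \alpha - \beta\gamma + \beta(1 - \delta)\,Y
   && \text{[third equation into second]} \\
Y &= \alpha - \beta\gamma + \beta(1 - \delta)\,Y + I_0 + G_0
   && \text{[result into first]} \\
(1 - \beta + \beta\delta)\,Y &= \alpha - \beta\gamma + I_0 + G_0 \\
Y^* &= \frac{\alpha - \beta\gamma + I_0 + G_0}{1 - \beta + \beta\delta}
\end{aligned}
```

The denominator is positive because $0 < \beta < 1$ and $\beta\delta > 0$, so the division in the last step is legitimate.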

From (7.18), there can be obtained six comparative-static derivatives. Among these, the
following three have special policy significance:

$$\frac{\partial Y^*}{\partial G_0} = \frac{1}{1 - \beta + \beta\delta} > 0 \qquad (7.19)$$

$$\frac{\partial Y^*}{\partial \gamma} = \frac{-\beta}{1 - \beta + \beta\delta} < 0 \qquad (7.20)$$

$$\frac{\partial Y^*}{\partial \delta} = \frac{-\beta(\alpha - \beta\gamma + I_0 + G_0)}{(1 - \beta + \beta\delta)^2} = \frac{-\beta Y^*}{1 - \beta + \beta\delta} < 0 \qquad [\text{by (7.18)}] \qquad (7.21)$$

The partial derivative in (7.19) gives us the government-expenditure multiplier. It has a
positive sign here because $\beta$ is less than 1, and $\beta\delta$ is greater than zero. If numerical values are
given for the parameters $\beta$ and $\delta$, we can also find the numerical value of this multiplier
from (7.19). The derivative in (7.20) may be called the nonincome-tax multiplier, because
it shows how a change in $\gamma$, the government revenue from nonincome-tax sources, will
affect the equilibrium income. This multiplier is negative in the present model because the
denominator in (7.20) is positive and the numerator is negative. Lastly, the partial
derivative in (7.21) (which is not in the nature of a multiplier, since it does not relate a dollar
change to another dollar change as the derivatives in (7.19) and (7.20) do) tells us the
extent to which an increase in the income tax rate $\delta$ will lower the equilibrium income.

Again, note the difference between the two derivatives $\partial Y^*/\partial G_0$ and $\partial Y/\partial G_0$. The
former is derived from (7.18), the expression for the equilibrium income. The latter,
obtainable from the first equation in (7.17), is $\partial Y/\partial G_0 = 1$, which is altogether different in
magnitude and in concept.
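To see how numerical values translate into multipliers, the sketch below evaluates (7.19) to (7.21) for one set of hypothetical parameter values (they are assumptions made here, chosen only to satisfy the restrictions in (7.17)) and checks each result against a difference quotient of (7.18).

```python
def Y_star(alpha, beta, gamma, delta, I0, G0):
    """Equilibrium income (7.18)."""
    return (alpha - beta * gamma + I0 + G0) / (1 - beta + beta * delta)

# Hypothetical values satisfying the restrictions in (7.17)
p = dict(alpha=50.0, beta=0.8, gamma=20.0, delta=0.25, I0=100.0, G0=90.0)

denom = 1 - p["beta"] + p["beta"] * p["delta"]
gov_multiplier = 1 / denom                            # (7.19)
nonincome_tax_multiplier = -p["beta"] / denom         # (7.20)
tax_rate_effect = -p["beta"] * Y_star(**p) / denom    # (7.21)

def numeric(name, h=1e-6):
    """Central difference quotient of Y* with respect to one parameter."""
    up, down = dict(p), dict(p)
    up[name] += h
    down[name] -= h
    return (Y_star(**up) - Y_star(**down)) / (2 * h)

print(gov_multiplier, numeric("G0"))
print(nonincome_tax_multiplier, numeric("gamma"))
print(tax_rate_effect, numeric("delta"))
```

With these assumed values the government-expenditure multiplier is about 2.5, whereas $\partial Y/\partial G_0$ taken from the first equation of (7.17) alone would be 1, which illustrates the distinction drawn above.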


Input-Output Model 

The solution of an open input-output model appears as a matrix equation $x^* = (I - A)^{-1}d$.
If we denote the inverse matrix $(I - A)^{-1}$ by $V = [v_{ij}]$, then, for instance, the solution for
a three-industry economy can be written as $x^* = Vd$, or

$$\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix} = \begin{bmatrix} v_{11} & v_{12} & v_{13} \\ v_{21} & v_{22} & v_{23} \\ v_{31} & v_{32} & v_{33} \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix} \qquad (7.22)$$


What are the rates of change of the solution values $x_j^*$ with respect to the exogenous final
demands $d_1$, $d_2$, and $d_3$? The general answer is that

$$\frac{\partial x_j^*}{\partial d_k} = v_{jk} \qquad (j, k = 1, 2, 3) \qquad (7.23)$$


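To see this, let us multiply out $Vd$ in (7.22) and express the solution as

$$\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix} = \begin{bmatrix} v_{11}d_1 + v_{12}d_2 + v_{13}d_3 \\ v_{21}d_1 + v_{22}d_2 + v_{23}d_3 \\ v_{31}d_1 + v_{32}d_2 + v_{33}d_3 \end{bmatrix}$$

In this system of three equations, each one gives a particular solution value as a function
of the exogenous final demands. Partial differentiation of these produces a total of nine
comparative-static derivatives:

$$\begin{aligned}
\frac{\partial x_1^*}{\partial d_1} &= v_{11} & \frac{\partial x_1^*}{\partial d_2} &= v_{12} & \frac{\partial x_1^*}{\partial d_3} &= v_{13} \\
\frac{\partial x_2^*}{\partial d_1} &= v_{21} & \frac{\partial x_2^*}{\partial d_2} &= v_{22} & \frac{\partial x_2^*}{\partial d_3} &= v_{23} \\
\frac{\partial x_3^*}{\partial d_1} &= v_{31} & \frac{\partial x_3^*}{\partial d_2} &= v_{32} & \frac{\partial x_3^*}{\partial d_3} &= v_{33}
\end{aligned} \qquad (7.23')$$

This is simply the expanded version of (7.23).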

Reading (7.23') as three distinct columns, we may combine the three derivatives in each
column into a matrix (vector) derivative:

$$\frac{\partial x^*}{\partial d_1} \equiv \frac{\partial}{\partial d_1}\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix} = \begin{bmatrix} v_{11} \\ v_{21} \\ v_{31} \end{bmatrix} \qquad \frac{\partial x^*}{\partial d_2} = \begin{bmatrix} v_{12} \\ v_{22} \\ v_{32} \end{bmatrix} \qquad \frac{\partial x^*}{\partial d_3} = \begin{bmatrix} v_{13} \\ v_{23} \\ v_{33} \end{bmatrix} \qquad (7.23'')$$

Since the three column vectors in (7.23'') are merely the columns of the matrix $V$, by
further consolidation we can summarize the nine derivatives in a single matrix derivative
$\partial x^*/\partial d$. Given $x^* = Vd$, we can simply write

$$\frac{\partial x^*}{\partial d} = \begin{bmatrix} v_{11} & v_{12} & v_{13} \\ v_{21} & v_{22} & v_{23} \\ v_{31} & v_{32} & v_{33} \end{bmatrix} = V = (I - A)^{-1}$$

Thus, $(I - A)^{-1}$, the inverse of the Leontief matrix, gives us an ordered display of all the
comparative-static derivatives of our open input-output model. Obviously, this matrix
derivative can easily be extended from the present three-industry model to the general
$n$-industry case.

Comparative-static derivatives of the input-output model are useful as tools of economic
planning, for they provide the answer to the question: If the planning targets, as reflected in
$(d_1, d_2, \ldots, d_n)$, are revised, and if we wish to take care of all direct and indirect
requirements in the economy so as to be completely free of bottlenecks, how must we change the
output goals of the $n$ industries?
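The planning question can be illustrated with a small computation. The sketch below assumes a hypothetical three-industry input-coefficient matrix $A$ and final-demand vector $d$ (neither is taken from the text), inverts $I - A$ exactly, and confirms that raising $d_2$ by one unit changes each output $x_j^*$ by $v_{j2}$.

```python
from fractions import Fraction as F

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination (exact, using Fractions)."""
    n = len(M)
    # Augment M with the identity matrix
    aug = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]   # scale pivot row to 1
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Hypothetical input-coefficient matrix A (not taken from the text)
A = [[F(1, 10), F(2, 10), F(3, 10)],
     [F(3, 10), F(1, 10), F(2, 10)],
     [F(2, 10), F(3, 10), F(1, 10)]]
I_minus_A = [[F(int(i == j)) - A[i][j] for j in range(3)] for i in range(3)]
V = inverse(I_minus_A)

d = [F(20), F(10), F(30)]          # hypothetical final demands
x_star = mat_vec(V, d)

# Raise d_2 by one unit: each x_j* changes by v_j2, the second column of V.
d_new = [d[0], d[1] + 1, d[2]]
change = [new - old for new, old in zip(mat_vec(V, d_new), x_star)]
print(change == [row[1] for row in V])
```

Exact fractions are used so that the comparison in the last line is not blurred by rounding; the same check works for any column of $V$.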


EXERCISE 7.5 

1. Examine the comparative-static properties of the equilibrium quantity in (7.15), and
check your results by graphic analysis.

2. On the basis of (7.18), find the partial derivatives $\partial Y^*/\partial I_0$, $\partial Y^*/\partial \alpha$, and $\partial Y^*/\partial \beta$.
Interpret their meanings and determine their signs.

3. The numerical input-output model (5.21) was solved in Sec. 5.7.

(a) How many comparative-static derivatives can be derived?

(b) Write out these derivatives in the form of (7.23') and (7.23'').


7.6 Note on Jacobian Determinants 


Our study of partial derivatives was motivated solely by comparative-static considerations.
But partial derivatives also provide a means of testing whether there exists functional
(linear or nonlinear) dependence among a set of $n$ functions in $n$ variables. This is related
to the notion of Jacobian determinants (named after Jacobi).

Consider the two functions

$$\begin{aligned} y_1 &= 2x_1 + 3x_2 \\ y_2 &= 4x_1^2 + 12x_1x_2 + 9x_2^2 \end{aligned} \qquad (7.24)$$

If we get all the four partial derivatives

$$\frac{\partial y_1}{\partial x_1} = 2 \qquad \frac{\partial y_1}{\partial x_2} = 3 \qquad \frac{\partial y_2}{\partial x_1} = 8x_1 + 12x_2 \qquad \frac{\partial y_2}{\partial x_2} = 12x_1 + 18x_2$$

and arrange them into a square matrix in a prescribed order, called a Jacobian matrix and
denoted by $J$, and then take its determinant, the result will be what is known as a Jacobian
determinant (or a Jacobian, for short), denoted by $|J|$:

$$|J| \equiv \begin{vmatrix} \dfrac{\partial y_1}{\partial x_1} & \dfrac{\partial y_1}{\partial x_2} \\[2ex] \dfrac{\partial y_2}{\partial x_1} & \dfrac{\partial y_2}{\partial x_2} \end{vmatrix} = \begin{vmatrix} 2 & 3 \\ (8x_1 + 12x_2) & (12x_1 + 18x_2) \end{vmatrix} \qquad (7.25)$$


For economy of space, this Jacobian is sometimes also expressed as

$$|J| \equiv \left|\frac{\partial(y_1, y_2)}{\partial(x_1, x_2)}\right|$$

More generally, if we have $n$ differentiable functions in $n$ variables, not necessarily linear,

$$\begin{aligned} y_1 &= f^1(x_1, x_2, \ldots, x_n) \\ y_2 &= f^2(x_1, x_2, \ldots, x_n) \\ &\;\;\vdots \\ y_n &= f^n(x_1, x_2, \ldots, x_n) \end{aligned} \qquad (7.26)$$

where the symbol $f^n$ denotes the $n$th function (and not the function raised to the $n$th
power), we can derive a total of $n^2$ partial derivatives. Adopting the notation $f^i_j \equiv \partial y_i/\partial x_j$,
we can write the Jacobian

$$|J| \equiv \left|\frac{\partial(y_1, y_2, \ldots, y_n)}{\partial(x_1, x_2, \ldots, x_n)}\right| \equiv \begin{vmatrix} \partial y_1/\partial x_1 & \cdots & \partial y_1/\partial x_n \\ \vdots & & \vdots \\ \partial y_n/\partial x_1 & \cdots & \partial y_n/\partial x_n \end{vmatrix} \qquad (7.27)$$

A Jacobian test for the existence of functional dependence among a set of $n$ functions is
provided by the following theorem: The Jacobian $|J|$ defined in (7.27) will be identically
zero for all values of $x_1, \ldots, x_n$ if and only if the $n$ functions $f^1, \ldots, f^n$ in (7.26) are
functionally (linearly or nonlinearly) dependent.

As an example, for the two functions in (7.24) the Jacobian as given in (7.25) has the
value

$$|J| = (24x_1 + 36x_2) - (24x_1 + 36x_2) = 0$$

That is, the Jacobian vanishes for all values of $x_1$ and $x_2$. Therefore, according to the
theorem, the two functions in (7.24) must be dependent. You can verify that $y_2$ is simply $y_1$
squared; thus they are indeed functionally dependent (here, nonlinearly dependent).
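Both claims, that the Jacobian vanishes and that $y_2 = y_1^2$, can be spot-checked at a few points. The sketch below uses integer test points chosen arbitrarily here; a check at finitely many points illustrates the identity but does not by itself prove it.

```python
def y1(x1, x2):
    return 2 * x1 + 3 * x2

def y2(x1, x2):
    return 4 * x1**2 + 12 * x1 * x2 + 9 * x2**2

def jacobian_det(x1, x2):
    """|J| for (7.24), built from the four partial derivatives given in the text."""
    j11, j12 = 2, 3
    j21, j22 = 8 * x1 + 12 * x2, 12 * x1 + 18 * x2
    return j11 * j22 - j12 * j21

points = [(0, 0), (1, 2), (-3, 5), (7, -4)]
print([jacobian_det(*p) for p in points])           # zero at every test point
print(all(y2(*p) == y1(*p) ** 2 for p in points))   # y2 equals y1 squared
```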

Let us now consider the special case of linear functions. We have earlier shown that the
rows of the coefficient matrix $A$ of a linear-equation system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= d_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= d_2 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= d_n \end{aligned} \qquad (7.28)$$

are linearly dependent if and only if the determinant $|A| = 0$. This result can now be
interpreted as a special application of the Jacobian criterion of functional dependence.

Take the left side of each equation in (7.28) as a separate function of the $n$ variables
$x_1, \ldots, x_n$, and denote these functions by $y_1, \ldots, y_n$. The partial derivatives of these
functions will turn out to be $\partial y_1/\partial x_1 = a_{11}$, $\partial y_1/\partial x_2 = a_{12}$, etc., so that we may write, in
general, $\partial y_i/\partial x_j = a_{ij}$. In view of this, the elements of the Jacobian of these $n$ functions will
be precisely the elements of the coefficient matrix $A$, already arranged in the correct order.
That is, we have $|J| = |A|$, and thus the Jacobian criterion of functional dependence among
$y_1, \ldots, y_n$ (or, what amounts to the same thing, linear dependence among the rows of the
coefficient matrix $A$) is equivalent to the criterion $|A| = 0$ in the present linear case.
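A two-equation illustration may help fix the idea. The coefficients below are hypothetical, chosen here so that the second row is twice the first:

```latex
\begin{aligned}
y_1 &= x_1 + 2x_2 \\
y_2 &= 2x_1 + 4x_2
\end{aligned}
\qquad\Longrightarrow\qquad
|J| = \begin{vmatrix} \partial y_1/\partial x_1 & \partial y_1/\partial x_2 \\ \partial y_2/\partial x_1 & \partial y_2/\partial x_2 \end{vmatrix}
    = \begin{vmatrix} 1 & 2 \\ 2 & 4 \end{vmatrix} = |A| = 1(4) - 2(2) = 0
```

The Jacobian coincides with the coefficient determinant and vanishes, which agrees with the fact that $y_2 = 2y_1$ for all values of $x_1$ and $x_2$.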

We have discussed the Jacobian in the context of a system of $n$ functions in $n$ variables.
It should be pointed out, however, that the Jacobian in (7.27) is defined even if each
function in (7.26) contains more than $n$ variables, say, $n + 2$ variables:

$$y_i = f^i(x_1, \ldots, x_n, x_{n+1}, x_{n+2}) \qquad (i = 1, 2, \ldots, n)$$

In such a case, if we hold any two of the variables (say, $x_{n+1}$ and $x_{n+2}$) constant, or treat
them as parameters, we will again have $n$ functions in exactly $n$ variables and can form a
Jacobian. Moreover, by holding a different pair of the $x$ variables constant, we can form a
different Jacobian. Such a situation will indeed be encountered in Chap. 8 in connection
with the discussion of the implicit-function theorem.


EXERCISE 7.6 

1. Use Jacobian determinants to test the existence of functional dependence between the
paired functions.

(a) $y_1 = 3x_1^2 + x_2$

$y_2 = 9x_1^4 + 6x_1^2(x_2 + 4) + x_2(x_2 + 8) + 12$

(b) $y_1 = 3x_1^2 + 2x_2^2$

$y_2 = 5x_1 + 1$

2. Consider (7.22) as a set of three functions $x_i^* = f^i(d_1, d_2, d_3)$ (with $i = 1, 2, 3$).

(a) Write out the $3 \times 3$ Jacobian. Does it have some relation to (7.23')? Can we write
$|J| = |V|$?

(b) Since $V = (I - A)^{-1}$, can we conclude that $|V| \neq 0$? What can we infer from this
about the three equations in (7.22)?



Chapter 8

Comparative-Static Analysis of General-Function Models



The study of partial derivatives has enabled us, in Chap. 7, to handle the simpler type of
comparative-static problems, in which the equilibrium solution of the model can be explicitly
stated in the reduced form. In that case, partial differentiation of the solution will
directly yield the desired comparative-static information. You will recall that the definition
of the partial derivative requires the absence of any functional relationship among the
independent variables (say, $x_i$), so that $x_1$ can vary without affecting the values of
$x_2, \ldots, x_n$. As applied to comparative-static analysis, this means that the parameters and/or
exogenous variables which appear in the reduced-form solution must be mutually independent.
Since these are indeed defined as predetermined data for purposes of the model, the
possibility of their mutually affecting one another is inherently ruled out. The procedure of
partial differentiation adopted in Chap. 7 is therefore fully justifiable.

However, no such expediency should be expected when, owing to the inclusion of general
functions in a model, no explicit reduced-form solution can be obtained. In such cases,
we will have to find the comparative-static derivatives directly from the originally given
equations in the model. Take, for instance, a simple national-income model with two
endogenous variables Y and C:

$$Y = C + I_0 + G_0$$

$$C = C(Y, T_0) \qquad [T_0\text{: exogenous taxes}]$$

which is reducible to a single equation (an equilibrium condition)

$$Y = C(Y, T_0) + I_0 + G_0$$

to be solved for $Y^*$. Because of the general form of the C function, however, no explicit
solution is available. We must, therefore, find the comparative-static derivatives directly
from this equation. How might we approach the problem? What special difficulty might we
encounter?

Let us suppose that an equilibrium solution $Y^*$ does exist. Then, under certain rather
general conditions (to be discussed in Section 8.5), we may take $Y^*$ to be a differentiable
function of the exogenous variables $I_0$, $G_0$, and $T_0$. Hence we may write the equation

$$Y^* = Y^*(I_0, G_0, T_0)$$

even though we are unable to determine explicitly the form which this function takes.
Furthermore, in some neighborhood of the equilibrium value $Y^*$, the following identical
equality will hold:

$$Y^* \equiv C(Y^*, T_0) + I_0 + G_0$$

This type of identity will be referred to as an equilibrium identity because it is nothing but
the equilibrium condition with the Y variable replaced by its equilibrium value $Y^*$. Now
that $Y^*$ has entered into the picture, it may seem at first blush that simple partial
differentiation of this identity will yield any desired comparative-static derivative, say,
$\partial Y^*/\partial T_0$. This, unfortunately, is not the case. Since $Y^*$ is a function of $T_0$, the two
arguments of the C function are not independent. Specifically, $T_0$ can in this case affect C
not only directly, but also indirectly via $Y^*$. Consequently, partial differentiation is no
longer appropriate for our purposes. How, then, do we tackle this situation?

The answer is that we must resort to total differentiation (as against partial
differentiation). Based on the notion of total differentials, the process of total differentiation
can lead us to the related concept of total derivative, which measures the rate of change of a
function such as $C(Y^*, T_0)$ with respect to the argument $T_0$, when $T_0$ also affects the other
argument, $Y^*$. Thus, once we become familiar with these concepts, we shall be able to deal
with functions whose arguments are not all independent, and that would remove the major
stumbling block we have so far encountered in our study of the comparative statics of a
general-function model. As a prelude to the discussion of these concepts, however, we
should first introduce the notion of differentials.

8.1 Differentials 


The symbol dy/dx, for the derivative of the function y = f(x), has hitherto been regarded
as a single entity. We shall now reinterpret it as a ratio of two quantities, dy and dx.


Differentials and Derivatives 

By definition, the derivative dy/dx = f'(x) is the limit of a difference quotient:

$$\frac{dy}{dx} = f'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \qquad (8.1)$$


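Thus, by itself, $\Delta y/\Delta x$ (without requiring $\Delta x \to 0$) is not equal to dy/dx. If we denote
the discrepancy between the two quotients by $\delta$, we can write

$$\frac{\Delta y}{\Delta x} - \frac{dy}{dx} = \delta \qquad \text{where } \delta \to 0 \text{ as } \Delta x \to 0 \quad [\text{by (8.1)}] \qquad (8.2)$$

Multiplying (8.2) through by $\Delta x$, and rearranging, we have

$$\Delta y = \frac{dy}{dx}\,\Delta x + \delta\,\Delta x$$

or

$$\Delta y = f'(x)\,\Delta x + \delta\,\Delta x \qquad (8.3)$$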


This equation describes the change in y ($\Delta y$) that results from a specific (not necessarily
small) change in x ($\Delta x$) from any starting value of x in the domain of the function
y = f(x). But it also suggests that we can, by ignoring the discrepancy term $\delta\,\Delta x$, use the
$f'(x)\,\Delta x$ term as an approximation to the true $\Delta y$ value, where the approximation gets
progressively better as $\Delta x$ gets progressively smaller.

FIGURE 8.1

In Fig. 8.1a, when x changes from $x_0$ to $x_0 + \Delta x$, a movement from point A to point B
occurs on the graph of y = f(x). The true $\Delta y$ is measured by the distance CB, and the ratio
of the two distances $CB/AC = \Delta y/\Delta x$ can be read from the slope of line segment AB. But
if we draw a tangent line AD through point A, and use AD in place of AB to approximate the
value of $\Delta y$, we obtain distance CD, which leaves distance DB as the discrepancy or error
of approximation. Since the slope of AD is $f'(x_0)$, distance CD is equal to $f'(x_0)\,\Delta x$ and,
by (8.3), distance DB is equal to $\delta\,\Delta x$. Obviously, as $\Delta x$ decreases, point B would slide
along the curve toward point A, thereby reducing the discrepancy and making $f'(x)$ or
dy/dx a better approximation to $\Delta y/\Delta x$.

Focusing on the tangent line AD, and taking the distance CD as an approximation to CB,
let us relabel the distances AC and CD by dx and dy, respectively, as in Fig. 8.1b. Then

$$\frac{dy}{dx} = \text{slope of tangent } AD = f'(x)$$

and, after multiplying through by dx, we get

$$dy = f'(x)\,dx \qquad (8.4)$$

The derivative f'(x) can then be reinterpreted as the factor of proportionality between the
two finite changes dy and dx. Accordingly, given a specific value of dx, we can multiply it
by f'(x) to get dy as an approximation to $\Delta y$, with the understanding that the smaller the
$\Delta x$, the better the approximation. The quantities dx and dy are called the differentials of x
and y, respectively.
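
The approximation described by (8.3) and (8.4) can be checked numerically. The following Python sketch is only an illustration: the function y = x^3 and the starting point x0 = 2 are assumptions chosen for convenience, not taken from the text. It compares the true change with the differential dy = f'(x0) dx and reports the discrepancy term of (8.2).

```python
# Illustrative sketch: the function y = x**3 and the point x0 = 2 are
# assumptions made for this example; any differentiable function would do.

def f(x):
    """Illustrative function y = x**3."""
    return x ** 3


def f_prime(x):
    """Its derivative, dy/dx = 3x**2."""
    return 3 * x ** 2


def differential(x0, dx):
    """dy = f'(x0) * dx, the tangent-line change of equation (8.4)."""
    return f_prime(x0) * dx


def true_change(x0, dx):
    """Delta y = f(x0 + dx) - f(x0), the actual change along the curve."""
    return f(x0 + dx) - f(x0)


def discrepancy(x0, dx):
    """delta of equation (8.2): (Delta y / Delta x) - dy/dx."""
    return true_change(x0, dx) / dx - f_prime(x0)


if __name__ == "__main__":
    x0 = 2.0
    for dx in (1.0, 0.1, 0.01, 0.001):
        # The gap between the last two printed columns is delta * dx.
        print(dx, discrepancy(x0, dx), true_change(x0, dx), differential(x0, dx))
```

For this particular function the discrepancy works out algebraically to 6 dx + dx^2, so it shrinks with dx, which is the behavior that Fig. 8.1 depicts as point B sliding toward point A.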

A few remarks are in order regarding differentials as mathematical entities. First, while
dx is an independent variable, dy is a dependent variable. Specifically, dy is a function of x
as well as of dx: It depends on x because a different position for $x_0$ in Fig. 8.1 would mean
a different location for point A and for its tangent line; it depends on dx because a different
magnitude of dx would mean a different position for point C as well as a different distance
CD. Second, if dx = 0, then dy = 0, because point B would in that case coincide with
point A. But if $dx \neq 0$, then it is possible to divide dy by dx to get f'(x), just as we can
multiply dx by f'(x) to get dy. Third, the differential dy can be expressed only in terms of
some other differential(s), here dx. This is because our context calls for the coupling of a
dependent change dy with an independent change dx. While it makes sense to write
dy = f'(x) dx, it is not meaningful to chop away the dx term on the right and write
dy = f'(x). The coupling of the two changes is effected through the derivative f'(x),
which may be viewed as a “converter” that serves to translate a given change dx into a
counterpart change dy.

The process of finding the differential dy from a given function y = f(x) is called
differentiation. Recall that we have been using this term as a synonym for derivation,
without having given an adequate explanation. In light of our interpretation of a derivative as a
quotient of two differentials, however, the rationale of the term becomes self-evident. It is
still somewhat ambiguous, though, to use the single term “differentiation” to refer to the
process of finding the differential dy as well as to that of finding the derivative dy/dx. To
avoid confusion, the usual practice is to qualify the word differentiation with the phrase
“with respect to x” when we take the derivative dy/dx.


Differentials and Point Elasticity 

To illustrate the economic application of differentials, let us consider the notion of the
elasticity of a function. Given a demand function Q = f(P), for instance, its elasticity is
defined as $(\Delta Q/Q)/(\Delta P/P)$. Using the idea of approximation explained in Fig. 8.1, we
can replace the independent change $\Delta P$ and the dependent change $\Delta Q$ with the
differentials dP and dQ, respectively, to get an approximation elasticity measure known as the
point elasticity of demand and denoted by $\varepsilon_d$ (the Greek letter epsilon, for “elasticity”):†

$$\varepsilon_d \equiv \frac{dQ/Q}{dP/P} = \frac{dQ/dP}{Q/P} \qquad (8.5)$$


Observe that on the extreme right of the expression we have rearranged the differentials
dQ and dP into a ratio dQ/dP, which can be construed as the derivative, or the marginal
function, of the demand function Q = f(P). Since we can interpret similarly the ratio
Q/P in the denominator as the average function of the demand function, the point
elasticity of demand $\varepsilon_d$ in (8.5) is seen to be the ratio of the marginal function to the average
function of the demand function.


† The point-elasticity measure can alternatively be interpreted as the limit of
$\dfrac{\Delta Q/Q}{\Delta P/P} = \dfrac{\Delta Q/\Delta P}{Q/P}$ as $\Delta P \to 0$, which gives the same result as (8.5).

Indeed, this last-described relationship is valid not only for the demand function but also
for any other function, because for any given total function y = f(x) we can write the
formula for the point elasticity of y with respect to x as

$$\varepsilon_{yx} = \frac{dy/dx}{y/x} = \frac{\text{marginal function}}{\text{average function}} \qquad (8.6)$$

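As a matter of convention, the absolute value of the elasticity measure is used in deciding
whether the function is elastic at a particular point. In the case of a demand function,
for instance, we stipulate:

$$
\text{The demand is }
\left\{ \begin{array}{l} \text{elastic} \\ \text{of unit elasticity} \\ \text{inelastic} \end{array} \right\}
\text{ at a point when } |\varepsilon_d|
\left\{ \begin{array}{c} > \\ = \\ < \end{array} \right\} 1.
$$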


Example 1 


Find $\varepsilon_d$ if the demand function is Q = 100 - 2P. The marginal function and the average
function of the given demand are

$$\frac{dQ}{dP} = -2 \qquad \text{and} \qquad \frac{Q}{P} = \frac{100 - 2P}{P}$$

so their ratio will give us

$$\varepsilon_d = \frac{-P}{50 - P}$$

As written, the elasticity is shown as a function of P. As soon as a specific price is chosen,
however, the point elasticity will be determinate in magnitude. When P = 25, for instance,
we have $\varepsilon_d = -1$, or $|\varepsilon_d| = 1$, so that the demand elasticity is unitary at that point. When
P = 30, in contrast, we have $|\varepsilon_d| = 1.5$; hence, demand is elastic at that price. More
generally, it may be verified that we have $|\varepsilon_d| > 1$ for 25 < P < 50 and $|\varepsilon_d| < 1$ for 0 < P < 25
in the present example. (Can a price P > 50 be considered meaningful here?)


Example 2 


Find the point elasticity of supply $\varepsilon_s$ from the supply function $Q = P^2 + 7P$, and determine
whether the supply is elastic at P = 2. Since the marginal and average functions are,
respectively,

$$\frac{dQ}{dP} = 2P + 7 \qquad \text{and} \qquad \frac{Q}{P} = P + 7$$

their ratio gives us the elasticity of supply

$$\varepsilon_s = \frac{2P + 7}{P + 7}$$

When P = 2, this elasticity has the value 11/9 > 1; thus the supply is elastic at P = 2.
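
The two examples can be reproduced with a few lines of Python. This is a minimal sketch of formula (8.6) as a ratio of the marginal to the average function; the function names are hypothetical, and the use of `math.isclose` to detect unit elasticity is an implementation choice made here because exact equality is fragile in floating-point arithmetic.

```python
import math


def point_elasticity(marginal, average):
    """Formula (8.6): marginal function divided by average function."""
    return marginal / average


def demand_elasticity(P):
    """Example 1: Q = 100 - 2P, so dQ/dP = -2 and Q/P = (100 - 2P)/P."""
    Q = 100 - 2 * P
    return point_elasticity(-2, Q / P)


def supply_elasticity(P):
    """Example 2: Q = P**2 + 7P, so dQ/dP = 2P + 7 and Q/P = P + 7."""
    Q = P ** 2 + 7 * P
    return point_elasticity(2 * P + 7, Q / P)


def classify(elasticity):
    """Apply the convention based on the absolute value of the elasticity."""
    size = abs(elasticity)
    if math.isclose(size, 1.0):
        return "of unit elasticity"
    return "elastic" if size > 1 else "inelastic"


if __name__ == "__main__":
    for P in (10, 25, 30):
        e = demand_elasticity(P)
        print("demand, P =", P, e, classify(e))
    e = supply_elasticity(2)
    print("supply, P = 2", e, classify(e))
```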


At the risk of digressing a trifle, it may also be added here that the interpretation of the
ratio of two differentials as a derivative, and the consequent transformation of the elasticity
formula of a function into a ratio of its marginal to its average, makes possible a quick
way of determining the point elasticity graphically. The two diagrams in Fig. 8.2 illustrate
the cases, respectively, of a negatively sloped curve and a positively sloped curve. In each
case, the value of the marginal function at point A on the curve, or at $x = x_0$ in the domain,
is measured by the slope of the tangent line AB. The value of the average function, on the
other hand, is in each case measured by the slope of line OA (the line joining the point of
origin with the given point on the curve, like a radius vector), because at point A we have
$y = x_0A$ and $x = Ox_0$, so that the average is $y/x = x_0A/Ox_0$ = slope of OA. The
elasticity at point A can thus be readily ascertained by comparing the numerical values of the
two slopes involved: If AB is steeper than OA, the function is elastic at point A; in the
opposite case, it is inelastic at A. Accordingly, the function pictured in Fig. 8.2a is inelastic
at A (or at $x = x_0$), whereas the one in Fig. 8.2b is elastic at A.

FIGURE 8.2

Moreover, the two slopes under comparison are directly dependent on the respective
sizes of the two angles $\theta_m$ and $\theta_a$ (Greek letter theta; the subscripts m and a indicate
marginal and average, respectively). Thus we may, alternatively, compare these two angles
instead of the two corresponding slopes. Referring to Fig. 8.2 again, you can see that $\theta_m < \theta_a$
at point A in diagram a, indicating that the marginal falls short of the average in numerical
value; thus the function is inelastic at point A. The exact opposite is true in Fig. 8.2b.

Sometimes, we are interested in locating a point of unitary elasticity on a given curve.
This can now be done easily. If the curve is negatively sloped, as in Fig. 8.3a, we should
find a point C such that the line OC and the tangent BC will make the same-sized angle with
the x axis, though in the opposite direction. In the case of a positively sloped curve, as in
Fig. 8.3b, one has only to find a point C such that the tangent line at C, when properly
extended, passes through the point of origin.

We must warn you that the graphical method just described is based on the assumption
that the function y = f(x) is plotted with the dependent variable y on the vertical axis. In
particular, in applying the method to a demand curve, we should make sure that Q is on the
vertical axis. (Now suppose that Q is actually plotted on the horizontal axis. How should
our method of reading the point elasticity be modified?)


EXERCISE 8.1 

1. Find the differential dy, given:

(a) $y = -x(x^2 + 3)$ (b) $y = (x - 8)(7x + 5)$ (c) y=

2. Given the import function M = f(Y), where M is imports and Y is national income,
express the income elasticity of imports $\varepsilon_{MY}$ in terms of the propensities to import.

3. Given the consumption function C = a + bY (with a > 0; 0 < b < 1):

(a) Find its marginal function and its average function.

(b) Find the income elasticity of consumption $\varepsilon_{CY}$, and determine its sign, assuming
Y > 0.

(c) Show that this consumption function is inelastic at all positive income levels.

4. Find the point elasticity of demand, given $Q = k/P^n$, where k and n are positive
constants.

(a) Does the elasticity depend on the price in this case?

(b) In the special case where n = 1, what is the shape of the demand curve? What is
the point elasticity of demand?

5. (a) Find a positively sloped curve with a constant point elasticity everywhere on the
curve.

(b) Write the equation of the curve, and verify by (8.6) that the elasticity is indeed a
constant.

6. Given Q = 100 - 2P + 0.02Y, where Q is quantity demanded, P is price, and Y is
income, and given P = 20 and Y = 5,000, find the

(a) Price elasticity of demand.

(b) Income elasticity of demand.


8.2 Total Differentials

The concept of differentials can easily be extended to a function of two or more
independent variables. Consider a saving function

$$S = S(Y, i) \qquad (8.7)$$

where S is savings, Y is national income, and i is the interest rate. This function is assumed
(as all the functions we shall use here will be assumed) to be continuous and to possess
continuous (partial) derivatives, or, symbolically, $f \in C'$. The partial derivative $\partial S/\partial Y$
measures the marginal propensity to save. Thus, for any change in Y, dY, the resulting
change in S can be approximated by the quantity $(\partial S/\partial Y)\,dY$, which is comparable to the
right-hand expression in (8.4). Similarly, given a change in i, di, we may take $(\partial S/\partial i)\,di$
as the approximation to the resulting change in S. The total change in S is then
approximated by the differential

$$dS = \frac{\partial S}{\partial Y}\,dY + \frac{\partial S}{\partial i}\,di \qquad (8.8)$$

or, in an alternative notation,

$$dS = S_Y\,dY + S_i\,di$$

Note that the two partial derivatives $S_Y$ and $S_i$ again play the role of “converters” that serve
to convert the changes dY and di, respectively, into a corresponding change dS. The
expression dS, being the sum of the approximate changes from both sources, is called the total
differential of the saving function. And the process of finding such a total differential is
called total differentiation. In contrast, the two additive components to the right of the
equals sign in (8.8) are referred to as the partial differentials of the saving function.

It is possible, of course, that Y may change while i remains constant. In that case,
di = 0, and the total differential will reduce to $dS = (\partial S/\partial Y)\,dY$. Dividing both sides by
dY, we get

$$\frac{\partial S}{\partial Y} = \left(\frac{dS}{dY}\right)_{i\ \text{constant}}$$

Thus it is clear that the partial derivative $\partial S/\partial Y$ can also be interpreted, in the spirit of
Fig. 8.1b, as the ratio of two differentials dS and dY, with the proviso that i, the other
independent variable in the function, is held constant. Analogously, we can interpret the partial
derivative $\partial S/\partial i$ as the ratio of the differential dS (with Y held constant) to the differential
di. Note that although dS and di can now each stand alone as a differential, the expression
$\partial S/\partial i$ remains as a single entity.

The more general case of a function of n independent variables can be exemplified by,
say, a utility function in the general form

$$U = U(x_1, x_2, \ldots, x_n) \qquad (8.9)$$

The total differential of this function can be written as

$$dU = \frac{\partial U}{\partial x_1}\,dx_1 + \frac{\partial U}{\partial x_2}\,dx_2 + \cdots + \frac{\partial U}{\partial x_n}\,dx_n$$

or

$$dU = U_1\,dx_1 + U_2\,dx_2 + \cdots + U_n\,dx_n \qquad (8.10)$$


in which each term on the right side indicates the approximate change in U resulting from
a change in one of the independent variables. Economically, the first term, $U_1\,dx_1$, means
the marginal utility of the first commodity times the increment in consumption of that
commodity, and similarly for the other terms. The sum of these, dU, thus represents the total
approximate change in utility originating from all possible sources of change. As the
reasoning in (8.3) shows, dU, as an approximation, tends toward the true change $\Delta U$ as all the
$dx_i$ terms tend to zero.

Like any other function, the saving function (8.7) and the utility function (8.9) can both
be expected to give rise to point-elasticity measures similar to that defined in (8.6). But each
elasticity measure must in these instances be defined in terms of the change in one of the
independent variables only; there will thus be two such elasticity measures to the saving
function, and n of them to the utility function. These are accordingly called partial
elasticities. For the saving function, the partial elasticities may be written as

$$\varepsilon_{SY} = \frac{\partial S/\partial Y}{S/Y} = \frac{\partial S}{\partial Y}\,\frac{Y}{S} \qquad \text{and} \qquad \varepsilon_{Si} = \frac{\partial S/\partial i}{S/i} = \frac{\partial S}{\partial i}\,\frac{i}{S}$$

For the utility function, the n partial elasticities can be concisely denoted as follows:

$$\varepsilon_{Ux_i} = \frac{\partial U}{\partial x_i}\,\frac{x_i}{U} \qquad (i = 1, 2, \ldots, n)$$

Example 1 


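Find the total differential for the following utility functions, where a, b > 0:

(a) $U(x_1, x_2) = ax_1 + bx_2$

(b) $U(x_1, x_2) = x_1^2 + x_2^3 + x_1x_2$

(c) $U(x_1, x_2) = x_1^a x_2^b$

The total differentials are as follows:

(a) $\dfrac{\partial U}{\partial x_1} = U_1 = a$ and $\dfrac{\partial U}{\partial x_2} = U_2 = b$

and

$$dU = U_1\,dx_1 + U_2\,dx_2 = a\,dx_1 + b\,dx_2$$

(b) $\dfrac{\partial U}{\partial x_1} = U_1 = 2x_1 + x_2$ and $\dfrac{\partial U}{\partial x_2} = U_2 = 3x_2^2 + x_1$

and

$$dU = U_1\,dx_1 + U_2\,dx_2 = (2x_1 + x_2)\,dx_1 + (3x_2^2 + x_1)\,dx_2$$

(c) $\dfrac{\partial U}{\partial x_1} = U_1 = ax_1^{a-1}x_2^b = \dfrac{ax_1^a x_2^b}{x_1}$ and $\dfrac{\partial U}{\partial x_2} = U_2 = bx_1^a x_2^{b-1} = \dfrac{bx_1^a x_2^b}{x_2}$

and

$$dU = \left(\frac{ax_1^a x_2^b}{x_1}\right) dx_1 + \left(\frac{bx_1^a x_2^b}{x_2}\right) dx_2$$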
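
To see how well a total differential tracks the true change, one can evaluate both numerically. The sketch below does this for utility function (b) of the example, using the partial derivatives $U_1 = 2x_1 + x_2$ and $U_2 = 3x_2^2 + x_1$. The evaluation point (1, 2) and the step sizes are arbitrary choices made for illustration only.

```python
def U(x1, x2):
    """Utility function (b) of the example: U = x1**2 + x2**3 + x1*x2."""
    return x1 ** 2 + x2 ** 3 + x1 * x2


def total_differential(x1, x2, dx1, dx2):
    """dU = U1 dx1 + U2 dx2, with U1 = 2*x1 + x2 and U2 = 3*x2**2 + x1."""
    U1 = 2 * x1 + x2
    U2 = 3 * x2 ** 2 + x1
    return U1 * dx1 + U2 * dx2


def actual_change(x1, x2, dx1, dx2):
    """Delta U: the true change in utility when both goods change together."""
    return U(x1 + dx1, x2 + dx2) - U(x1, x2)


if __name__ == "__main__":
    x1, x2 = 1.0, 2.0  # arbitrary evaluation point
    for step in (0.1, 0.01, 0.001):
        dU = total_differential(x1, x2, step, step)
        delta_U = actual_change(x1, x2, step, step)
        # The approximation error shrinks faster than the step itself.
        print(step, dU, delta_U, delta_U - dU)
```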


EXERCISE 8.2 


1. Express the total differential dU by using the gradient vector $\nabla U$.

2. Find the total differential, given

(a) $z = 3x^2 + xy - 2y^3$

(b) $U = 2x_1 + 9x_1x_2 + x_2^2$

3. Find the total differential, given

(a) $y = \dfrac{x_1}{x_1 + x_2}$ (b) $y = \dfrac{2x_1x_2}{x_1 + x_2}$

4. The supply function of a certain commodity is

$$Q = a + bP^2 + R^{1/2} \qquad (a < 0,\ b > 0) \qquad [R\text{: rainfall}]$$

Find the price elasticity of supply $\varepsilon_{QP}$, and the rainfall elasticity of supply $\varepsilon_{QR}$.

5. How do the two partial elasticities in Prob. 4 vary with P and R? In a strictly monotonic
fashion (assuming positive P and R)?

6. The foreign demand for our exports X depends on the foreign income $Y_f$ and our price
level P: $X = Y_f^{1/2} + P^{-2}$. Find the partial elasticity of foreign demand for our exports
with respect to our price level.

7. Find the total differential for each of the following functions:

(a) $U = -5x^3 - 12xy - 6y^5$

(b) $U = 7x^2y^3$

(c) $U = 3x^3(8x - 7y)$

(d) $U = (5x^2 + 7y)(2x - 4y^3)$

(e) $U = \dfrac{9y^3}{x - y}$

(f) $U = (x - 3y)^3$


8.3 Rules of Differentials 


A straightforward way of finding the total differential dy, given a function

$$y = f(x_1, x_2)$$

is to find the partial derivatives $f_1$ and $f_2$ and substitute these into the equation

$$dy = f_1\,dx_1 + f_2\,dx_2$$

But sometimes it may be more convenient to apply certain rules of differentials which, in
view of their striking resemblance to the derivative formulas studied before, are very easy
to remember.

Let k be a constant and u and v be two functions of the variables $x_1$ and $x_2$. Then the
following rules are valid:†

Rule I $dk = 0$ (cf. constant-function rule)

Rule II $d(cu^n) = cnu^{n-1}\,du$ (cf. power-function rule)

Rule III $d(u \pm v) = du \pm dv$ (cf. sum-difference rule)

Rule IV $d(uv) = v\,du + u\,dv$ (cf. product rule)

Rule V $d\left(\dfrac{u}{v}\right) = \dfrac{1}{v^2}(v\,du - u\,dv)$ (cf. quotient rule)

Instead of proving these rules here, we shall merely illustrate their practical application.


† All the rules of differentials discussed in this section are also applicable when u and v are themselves
the independent variables (rather than functions of some other variables $x_1$ and $x_2$).




Example 1


Find the total differential dy of the function

$$y = 5x_1^2 + 3x_2$$

The straightforward method calls for the evaluation of the partial derivatives $f_1 = 10x_1$ and
$f_2 = 3$, which will then enable us to write

$$dy = f_1\,dx_1 + f_2\,dx_2 = 10x_1\,dx_1 + 3\,dx_2$$

We may, however, let $u = 5x_1^2$ and $v = 3x_2$ and apply the previously given rules to get the
identical answer as follows:

$$
\begin{aligned}
dy &= d(5x_1^2) + d(3x_2) && [\text{by Rule III}] \\
   &= 10x_1\,dx_1 + 3\,dx_2 && [\text{by Rule II}]
\end{aligned}
$$


Example 2


Find the total differential of the function

$$y = 3x_1^2 + x_1x_2^2$$

Since $f_1 = 6x_1 + x_2^2$ and $f_2 = 2x_1x_2$, the desired differential is

$$dy = (6x_1 + x_2^2)\,dx_1 + 2x_1x_2\,dx_2$$

By applying the given rules, the same result can be arrived at thus:

$$
\begin{aligned}
dy &= d(3x_1^2) + d(x_1x_2^2) && [\text{by Rule III}] \\
   &= 6x_1\,dx_1 + x_2^2\,dx_1 + x_1\,d(x_2^2) && [\text{by Rules II and IV}] \\
   &= (6x_1 + x_2^2)\,dx_1 + 2x_1x_2\,dx_2 && [\text{by Rule II}]
\end{aligned}
$$


Example 3


Find the total differential of the function

$$y = \frac{x_1 + x_2}{2x_1^2}$$

In view of the fact that the partial derivatives in this case are

$$f_1 = \frac{-(x_1 + 2x_2)}{2x_1^3} \qquad \text{and} \qquad f_2 = \frac{1}{2x_1^2}$$

(check these as an exercise), the desired differential is

$$dy = \frac{-(x_1 + 2x_2)}{2x_1^3}\,dx_1 + \frac{1}{2x_1^2}\,dx_2$$

However, the same result may also be obtained by application of the rules as follows:

$$
\begin{aligned}
dy &= \frac{1}{4x_1^4}\left[2x_1^2\,d(x_1 + x_2) - (x_1 + x_2)\,d(2x_1^2)\right] && [\text{by Rule V}] \\
   &= \frac{1}{4x_1^4}\left[2x_1^2(dx_1 + dx_2) - (x_1 + x_2)\,4x_1\,dx_1\right] && [\text{by Rules III and II}] \\
   &= \frac{1}{4x_1^4}\left[-2x_1(x_1 + 2x_2)\,dx_1 + 2x_1^2\,dx_2\right] \\
   &= \frac{-(x_1 + 2x_2)}{2x_1^3}\,dx_1 + \frac{1}{2x_1^2}\,dx_2
\end{aligned}
$$
These rules can naturally be extended to cases where more than two functions of $x_1$ and
$x_2$ are involved. In particular, we can add the following two rules to the previous collection:

Rule VI $d(u \pm v \pm w) = du \pm dv \pm dw$

Rule VII $d(uvw) = vw\,du + uw\,dv + uv\,dw$

To derive Rule VII, we can employ the familiar trick of first letting z = vw, so that

$$d(uvw) = d(uz) = z\,du + u\,dz \qquad [\text{by Rule IV}]$$

Then, by applying Rule IV again to dz, we get the intermediate result

$$dz = d(vw) = w\,dv + v\,dw$$

which, when substituted into the preceding equation, will yield

$$d(uvw) = vw\,du + u(w\,dv + v\,dw) = vw\,du + uw\,dv + uv\,dw$$

as the desired final result. A similar procedure can be employed to derive Rule VI.

EXERCISE 8.3 


1. Use the rules of differentials to find (a) dz from $z = 3x^2 + xy - 2y^3$ and (b) dU from
$U = 2x_1 + 9x_1x_2 + x_2^2$. Check your answers against those obtained for Exercise 8.2-2.

2. Use the rules of differentials to find dy from the following functions:

(a) $y = \dfrac{x_1}{x_1 + x_2}$ (b) $y = \dfrac{2x_1x_2}{x_1 + x_2}$

Check your answers against those obtained for Exercise 8.2-3.

3. Given $y = 3x_1(2x_2 - 1)(x_3 + 5)$

(a) Find dy by Rule VII.

(b) Find the differential of y, if $dx_2 = dx_3 = 0$.

4. Prove Rules II, III, IV, and V, assuming u and v to be the independent variables (rather
than functions of some other variables).


8.4 Total Derivatives 


We shall now tackle the question posed at the beginning of the chapter: namely, how can we
find the rate of change of the function $C(Y^*, T_0)$ with respect to $T_0$, when $Y^*$ and $T_0$ are
related? As previously mentioned, the answer lies in the concept of total derivative. Unlike
a partial derivative, a total derivative does not require the argument $Y^*$ to remain constant
as $T_0$ varies, and can thus allow for the postulated relationship between the two arguments.

Finding the Total Derivative 

To carry on the discussion in a general framework, let us consider any function

$$y = f(x, w) \qquad \text{where} \qquad x = g(w) \qquad (8.11)$$

The two functions f and g can also be combined into a composite function

$$y = f[g(w), w] \qquad (8.11')$$

The three variables y, x, and w are related to one another as shown in Fig. 8.4. In this figure,
which we shall refer to as a channel map, it is clearly seen that w, the ultimate source of
change, can affect y through two separate channels: (1) indirectly, via the function g and
then f (the straight arrows), and (2) directly, via the function f (the curved arrow). The
direct effect can simply be represented by the partial derivative $f_w$. But the indirect effect
can only be expressed by a product of two derivatives, $f_x \dfrac{dx}{dw}$, or $\dfrac{\partial y}{\partial x}\dfrac{dx}{dw}$, by the chain rule
for a composite function. Adding up the two effects gives us the desired total derivative of
y with respect to w:

$$\frac{dy}{dw} = f_x\,\frac{dx}{dw} + f_w = \frac{\partial y}{\partial x}\frac{dx}{dw} + \frac{\partial y}{\partial w} \qquad (8.12)$$


This total derivative can also be obtained by an alternative method: We may first
differentiate the function y = f(x, w) totally, to get the total differential

$$dy = f_x\,dx + f_w\,dw$$

and then divide through by dw. The result is identical with (8.12). Either way, the process
of finding the total derivative dy/dw is referred to as the total differentiation of y with
respect to w.

It is extremely important to distinguish between the two look-alike symbols $dy/dw$ and
$\partial y/\partial w$ in (8.12). The former is a total derivative, and the latter, a partial derivative. The
latter is in fact merely a component of the former.


Example 1 


Find the total derivative dy/dw, given the function

$$y = f(x, w) = 3x - w^2 \qquad \text{where} \qquad x = g(w) = 2w^2 + w + 4$$

By virtue of (8.12), the total derivative should be

$$\frac{dy}{dw} = 3(4w + 1) + (-2w) = 10w + 3$$

As a check, we may substitute the function g into the function f, to get

$$y = 3(2w^2 + w + 4) - w^2 = 5w^2 + 3w + 12$$

which is now a function of w alone. The derivative dy/dw is then easily found to be 10w + 3,
the identical answer.
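
The same check can be carried out numerically. The Python sketch below evaluates formula (8.12) for the functions of this example and compares the result with a central-difference estimate taken along the composite function. The step size h and the helper names are assumptions of this illustration.

```python
def g(w):
    """x = g(w) = 2w^2 + w + 4."""
    return 2 * w ** 2 + w + 4


def f(x, w):
    """y = f(x, w) = 3x - w^2."""
    return 3 * x - w ** 2


def total_derivative(w):
    """Formula (8.12): dy/dw = f_x * dx/dw + f_w."""
    f_x = 3            # partial derivative of f with respect to x
    dx_dw = 4 * w + 1  # derivative of g
    f_w = -2 * w       # direct effect of w, holding x constant
    return f_x * dx_dw + f_w


def numerical_total_derivative(w, h=1e-6):
    """Central-difference estimate along the composite function f[g(w), w]."""
    return (f(g(w + h), w + h) - f(g(w - h), w - h)) / (2 * h)


if __name__ == "__main__":
    for w in (0.0, 1.0, 2.0):
        # The partial derivative f_w = -2w alone would miss the indirect channel.
        print(w, total_derivative(w), numerical_total_derivative(w), -2 * w)
```

The last printed column, the simple partial derivative, differs from the total derivative by exactly the indirect effect 3(4w + 1).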
Example 2


If we have a utility function U = U(c, s), where c is the amount of coffee consumed and s is
the amount of sugar consumed, and another function s = g(c) indicating the
complementarity between these two goods, then we can simply write the composite function

$$U = U[c, g(c)]$$

from which it follows that

$$\frac{dU}{dc} = \frac{\partial U}{\partial c} + \frac{\partial U}{\partial g(c)}\,g'(c)$$

A Variation on the Theme 

The situation is only slightly more complicated when we have

$$y = f(x_1, x_2, w) \qquad \text{where} \qquad \begin{cases} x_1 = g(w) \\ x_2 = h(w) \end{cases} \qquad (8.13)$$

The channel map will now appear as in Fig. 8.5. This time, the variable w can affect y
through three channels: (1) indirectly, via the function g and then f, (2) again indirectly, via
the function h and then f, and (3) directly via f. From our previous experience, these three
effects are expected to be expressible, respectively, as $\dfrac{\partial y}{\partial x_1}\dfrac{dx_1}{dw}$, $\dfrac{\partial y}{\partial x_2}\dfrac{dx_2}{dw}$, and $\dfrac{\partial y}{\partial w}$. By
adding these together, we get the total derivative

$$
\begin{aligned}
\frac{dy}{dw} &= \frac{\partial y}{\partial x_1}\frac{dx_1}{dw} + \frac{\partial y}{\partial x_2}\frac{dx_2}{dw} + \frac{\partial y}{\partial w} \\
              &= f_1\,\frac{dx_1}{dw} + f_2\,\frac{dx_2}{dw} + f_w
\end{aligned}
\qquad (8.14)
$$

which is comparable to (8.12). If we take the total differential dy, and then divide through
by dw, we can arrive at the same result.


Example 3 


Let the production function be

$$Q = Q(K, L, t)$$

where, aside from the two inputs K and L, there is a third argument t, denoting time. The
presence of the t argument indicates that the production function can shift over time in
reflection of technological changes. Thus this is a dynamic rather than a static production
function. Since capital and labor, too, can change over time, we may write

$$K = K(t) \qquad \text{and} \qquad L = L(t)$$

Then the rate of change of output with respect to time can be expressed, in line with the
total-derivative formula (8.14), as

$$\frac{dQ}{dt} = \frac{\partial Q}{\partial K}\frac{dK}{dt} + \frac{\partial Q}{\partial L}\frac{dL}{dt} + \frac{\partial Q}{\partial t}$$

or, in an alternative notation,

$$\frac{dQ}{dt} = Q_K K'(t) + Q_L L'(t) + Q_t$$


Another Variation on the Theme 

When the ultimate source of change, w in (8.13), is replaced by two coexisting sources, u
and v, the situation becomes the following:

$$y = f(x_1, x_2, u, v) \qquad \text{where} \qquad \begin{cases} x_1 = g(u, v) \\ x_2 = h(u, v) \end{cases} \qquad (8.15)$$

While the channel map will now contain more arrows, the principle of its construction
remains the same; we shall, therefore, leave it to you to draw. To find the total derivative of
y with respect to u (while v is held constant), let us take the total differential of y, and then
divide through by the differential du, with the result:

$$
\begin{aligned}
\frac{dy}{du} &= \frac{\partial y}{\partial x_1}\frac{dx_1}{du} + \frac{\partial y}{\partial x_2}\frac{dx_2}{du} + \frac{\partial y}{\partial u}\frac{du}{du} + \frac{\partial y}{\partial v}\frac{dv}{du} \\
              &= \frac{\partial y}{\partial x_1}\frac{dx_1}{du} + \frac{\partial y}{\partial x_2}\frac{dx_2}{du} + \frac{\partial y}{\partial u}
              \qquad \left[\frac{dv}{du} = 0 \text{ since } v \text{ is held constant}\right]
\end{aligned}
$$

In view of the fact that we are varying u while holding v constant (as a single derivative
cannot handle changes in u and v both), however, the result obtained must be modified in
two ways: (1) the derivatives $dx_1/du$ and $dx_2/du$ on the right should be rewritten with the
partial sign as $\partial x_1/\partial u$ and $\partial x_2/\partial u$, which is in line with the functions g and h in (8.15); and
(2) the ratio dy/du on the left should also be interpreted as a partial derivative, even
though, being derived through the process of total differentiation of y, it is actually in the
nature of a total derivative. For this reason, we shall refer to it by the explicit name of
partial total derivative, and denote it by §y/§u (with § rather than $\partial$), in order to
distinguish it from the simple partial derivative $\partial y/\partial u$ which, as our result shows, is but one of
three component terms that add up to the partial total derivative.†

With these modifications, our result becomes 


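$$\frac{\S y}{\S u} = \frac{\partial y}{\partial x_1}\frac{\partial x_1}{\partial u} + \frac{\partial y}{\partial x_2}\frac{\partial x_2}{\partial u} + \frac{\partial y}{\partial u} \tag{8.16}$$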

which is comparable to (8.14). Note the appearance of the symbol $\partial y/\partial u$ on the right,
which necessitates the adoption of the new symbol §y/§u on the left to indicate the broader
concept of a partial total derivative. In a perfectly analogous manner, we can derive the
other partial total derivative, §y/§v. Inasmuch as the roles of u and v are symmetrical in
(8.15), however, a simpler alternative is available to us. All we have to do to obtain §y/§v
is to replace the symbol u in (8.16) by the symbol v throughout.

† An alternative way of denoting this partial total derivative is

$$\left.\frac{dy}{du}\right|_{v\ \text{constant}} \quad \text{or} \quad \left.\frac{dy}{du}\right|_{dv = 0}$$

The use of the new symbols §y/§u and §y/§v for the partial total derivatives, if unconventional,
serves the good purpose of avoiding confusion with the simple partial derivatives
$\partial y/\partial u$ and $\partial y/\partial v$ that can arise from the function f alone in (8.15). However, in the
special case where the f function takes the form of $y = f(x_1, x_2)$ without the arguments
u and v, the simple partial derivatives $\partial y/\partial u$ and $\partial y/\partial v$ are not defined. Hence, it may not
be inappropriate in such a case to use the latter symbols for the partial total derivatives of y
with respect to u and v, since no confusion is likely to arise. Even in that event, though, the
use of a special symbol is advisable for the sake of greater clarity.
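To see the difference between the partial total derivative and the simple partial derivative in numbers, the following sketch applies (8.16) to one concrete case. The particular functions f, g, and h, and the evaluation point, are assumptions chosen only for illustration.

```python
# Hypothetical functions fitting the pattern of (8.15); all forms are assumptions.
def f(x1, x2, u, v):
    return x1 * x2 + u**2 + 3.0 * v

def g(u, v):
    return u * v            # x1 = g(u, v)

def h(u, v):
    return u + v**2         # x2 = h(u, v)

def simple_partial_du(u, v):
    """The simple partial derivative of f with respect to u: the direct channel only."""
    return 2.0 * u

def partial_total_du(u, v):
    """The partial total derivative in (8.16): two indirect channels plus the direct one."""
    x1, x2 = g(u, v), h(u, v)
    f_1 = x2                # partial of f with respect to x1
    f_2 = x1                # partial of f with respect to x2
    g_u = v                 # partial of g with respect to u
    h_u = 1.0               # partial of h with respect to u
    return f_1 * g_u + f_2 * h_u + simple_partial_du(u, v)

def composite(u, v):
    return f(g(u, v), h(u, v), u, v)

def numeric_du(u, v, step=1e-6):
    """Central difference in u with v held constant."""
    return (composite(u + step, v) - composite(u - step, v)) / (2.0 * step)

if __name__ == "__main__":
    u0, v0 = 1.5, 2.0
    print(partial_total_du(u0, v0))   # 17.0
    print(simple_partial_du(u0, v0))  # 3.0
    print(numeric_du(u0, v0))         # close to 17.0
```

At the chosen point the simple partial derivative (3) is only one of the three component terms of the partial total derivative (17), in line with the remark made about (8.16).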

Some General Remarks 

To conclude this section, we offer three general remarks regarding total derivative and total 
differentiation: 

1. In the cases we have discussed, the situation involves without exception a variable that 
is functionally dependent on a second variable, which is in turn dependent functionally 
on a third variable. As a consequence, the notion of a chain inevitably enters the picture, 
as evidenced by the appearance of a product (or products) of two derivative expressions 
as the component(s) of a total derivative. For this reason, the total-derivative formulas in
(8.12), (8.14), and (8.16) can also be regarded as expressions of the chain rule, or the 
composite-function rule—a more sophisticated version of the chain rule introduced in 
Sec. 7.3. 

2. The chain of derivatives does not have to be limited to only two “links” (two derivatives
being multiplied); the concept of total derivative should be extendible to cases where 
there are three or more links in the composite function. 

3. In all cases discussed, total derivatives (including those which have been called partial
total derivatives) measure rates of change with respect to some ultimate variables in
the chain or, in other words, with respect to certain variables which are in a sense
exogenous and which are not expressed as functions of some other variables. The
essence of the total derivative and of the process of total differentiation is to make
due allowance for all the channels, indirect as well as direct, through which the effects
of a change in an ultimate independent variable can possibly be carried to the particular 
dependent variable under study. 
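The second remark can be made concrete with a small worked case of a three-link chain. The functions below are assumptions chosen only for illustration: let $y = f(x) = x^2$, $x = g(w) = 3w + 1$, and $w = h(t) = t^3$, so that t is the ultimate variable.

```latex
\begin{aligned}
\frac{dy}{dt} &= \frac{dy}{dx}\,\frac{dx}{dw}\,\frac{dw}{dt}
               = (2x)(3)(3t^2) = 18t^2 x = 18t^2\,(3t^3 + 1) \\[4pt]
\text{Check: } y &= (3t^3 + 1)^2
  \;\Longrightarrow\;
  \frac{dy}{dt} = 2(3t^3 + 1)(9t^2) = 18t^2\,(3t^3 + 1)
\end{aligned}
```

The product of three derivative expressions, one per link, gives the same result as differentiating the fully substituted function directly.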


EXERCISE 8.4

1. Find the total derivative dz/dy, given

(a) $z = f(x, y) = 5x + xy - y^2$, where $x = g(y) = 3y^2$

(b) $z = 4x^2 - 3xy + 2y^2$, where $x = 1/y$

(c) $z = (x + y)(x - 2y)$, where $x = 2 - 7y$

2. Find the total derivative dz/dt, given

(a) $z = x^2 - 8xy - y^3$, where $x = 3t$ and $y = 1 - t$

(b) $z = 7u + vt$, where $u = 2t^2$ and $v = t + 1$

(c) $z = f(x, y, t)$, where $x = a - bt$ and $y = c + kt$

3. Find the rate of change of output with respect to time, if the production function is
$Q = A(t)K^{\alpha}L^{\beta}$, where A(t) is an increasing function of t, and $K = K_0 + at$, and
$L = L_0 + bt$.

4. Find the partial total derivatives §W/§u and §W/§v if

(a) $W = ax^2 + bxy + cu$, where $x = \alpha u + \beta v$ and $y = \gamma u$

(b) $W = f(x_1, x_2)$, where $x_1 = 5u^2 + 3v$ and $x_2 = u - 4v^3$

5. Draw a channel map appropriate to the case of (8.15).

6. Derive the expression for §y/§v formally from (8.15) by taking the total differential of y
and then dividing through by dv.

8.5 Derivatives of Implicit Functions

The concept of total differentials can also enable us to find the derivatives of so-called 
implicit functions. 

Implicit Functions 

A function given in the form of y = f(x), say,

$$y = f(x) = 3x^4 \tag{8.17}$$

is called an explicit function, because the variable y is explicitly expressed as a function of
x. If this function is written alternatively in the equivalent form

$$y - 3x^4 = 0 \tag{8.17'}$$

however, we no longer have an explicit function. Rather, the function (8.17) is then only
implicitly defined by the equation (8.17'). When we are (only) given an equation in the
form of (8.17'), therefore, the function y = f(x) which it implies, and whose specific form
may not even be known to us, is referred to as an implicit function.

An equation in the form of (8.17') can be denoted in general by F(y, x) = 0, because
its left side is a function of the two variables y and x. Note that we are using the capital
letter F here to distinguish it from the function f; the function F, representing the left-side
expression in (8.17'), has two arguments, y and x, whereas the function f, representing the
implicit function, has only one argument, x. There may, of course, be more than two
arguments in the F function. For instance, we may encounter an equation $F(y, x_1, \ldots, x_m) = 0$.
Such an equation may also define an implicit function $y = f(x_1, \ldots, x_m)$.

The equivocal word may in the last sentence was used advisedly. For, whereas an explicit
function, say, y = f(x), can always be transformed into an equation F(y, x) = 0 by simply
transposing the f(x) expression to the left side of the equals sign, the reverse transformation
is not always possible. Indeed, in certain cases, a given equation in the form of
F(y, x) = 0 may not implicitly define a function y = f(x). For instance, the equation
$x^2 + y^2 = 0$ is satisfied only at the point of origin (0, 0), and hence yields no meaningful
function to speak of. As another example, the equation

$$F(y, x) = x^2 + y^2 - 9 = 0 \tag{8.18}$$

implies not a function, but a relation, because (8.18) plots as a circle, as shown in Fig. 8.6,
so that no unique value of y corresponds to each value of x. Note, however, that if we
restrict y to nonnegative values, then we will have the upper half of the circle only, and that
does constitute a function, namely, $y = +\sqrt{9 - x^2}$. Similarly, the lower half of the circle,
with y values nonpositive, constitutes another function, $y = -\sqrt{9 - x^2}$. In contrast, neither
the left half nor the right half of the circle can qualify as a function.

FIGURE 8.6 (graph of the circle (8.18))

In view of this uncertainty, it becomes of interest to ask whether there are known general
conditions under which we can be sure that a given equation in the form of

$$F(y, x_1, \ldots, x_m) = 0 \tag{8.19}$$

does indeed define an implicit function

$$y = f(x_1, \ldots, x_m) \tag{8.20}$$

locally, i.e., around some specific point in the domain. The answer to this lies in the
so-called implicit-function theorem, which states that:

Given (8.19), if (a) the function F has continuous partial derivatives $F_y, F_1, \ldots, F_m$, and if
(b) at a point $(y_0, x_{10}, \ldots, x_{m0})$ satisfying the equation (8.19), $F_y$ is nonzero, then there
exists an m-dimensional neighborhood of $(x_{10}, \ldots, x_{m0})$, N, in which y is an implicitly defined
function of the variables $x_1, \ldots, x_m$, in the form of (8.20). This implicit function satisfies
$y_0 = f(x_{10}, \ldots, x_{m0})$. It also satisfies the equation (8.19) for every m-tuple $(x_1, \ldots, x_m)$ in
the neighborhood N, thereby giving (8.19) the status of an identity in that neighborhood.
Moreover, the implicit function f is continuous and has continuous partial derivatives
$f_1, \ldots, f_m$.

Let us apply this theorem to the equation of the circle, (8.18), which contains only one
x variable. First, we can duly verify that $F_y = 2y$ and $F_x = 2x$ are continuous, as required.
Then we note that $F_y$ is nonzero except when y = 0, that is, except at the leftmost point
(-3, 0) and the rightmost point (3, 0) on the circle. Thus, around any point on the circle
except (-3, 0) and (3, 0), we can construct a neighborhood in which the equation (8.18)
defines an implicit function y = f(x). This is easily verifiable in Fig. 8.6, where it is indeed
possible to draw, say, a rectangle around any point on the circle, except (-3, 0) and
(3, 0), such that the portion of the circle enclosed therein will constitute the graph of a
function, with a unique y value for each value of x in that rectangle.

Several things should be noted about the implicit-function theorem. First, the conditions
cited in the theorem are in the nature of sufficient (but not necessary) conditions. This
means that if we happen to find $F_y = 0$ at a point satisfying (8.19), we cannot use the
theorem to deny the existence of an implicit function around that point. For such a function
may in fact exist (see Exercise 8.5-7).† Second, even if an implicit function f is assured to
exist, the theorem gives no clue as to the specific form the function f takes. Nor, for that
matter, does it tell us the exact size of the neighborhood N in which the implicit function is
defined. However, despite these limitations, this theorem is one of great importance. For
whenever the conditions of the theorem are satisfied, it now becomes meaningful to talk
about and make use of a function such as (8.20), even if our model may contain an equation
(8.19) which is difficult or impossible to solve explicitly for y in terms of the x
variables. Moreover, since the theorem also guarantees the existence of the partial derivatives
$f_1, \ldots, f_m$, it is now also meaningful to talk about these derivatives of the implicit
function.


Derivatives of Implicit Functions 

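If the equation $F(y, x_1, \ldots, x_m) = 0$ can be solved for y, we can explicitly write out the
function $y = f(x_1, \ldots, x_m)$, and find its derivatives by the methods learned before. For
instance, (8.18) can be solved to yield two separate functions

$$\begin{aligned}
y^+ &= +\sqrt{9 - x^2} \qquad \text{[upper half of circle]} \\
y^- &= -\sqrt{9 - x^2} \qquad \text{[lower half of circle]}
\end{aligned} \tag{8.18'}$$

and their derivatives can be found as follows:

$$\begin{aligned}
\frac{dy^+}{dx} &= \frac{d}{dx}(9 - x^2)^{1/2} = \frac{1}{2}(9 - x^2)^{-1/2}(-2x) = \frac{-x}{\sqrt{9 - x^2}} = \frac{-x}{y^+} \qquad (y^+ \neq 0) \\
\frac{dy^-}{dx} &= \frac{d}{dx}\left[-(9 - x^2)^{1/2}\right] = -\frac{1}{2}(9 - x^2)^{-1/2}(-2x) = \frac{x}{\sqrt{9 - x^2}} = \frac{-x}{y^-} \qquad (y^- \neq 0)
\end{aligned} \tag{8.21}$$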
But what if the given equation, $F(y, x_1, \ldots, x_m) = 0$, cannot be solved for y explicitly?
In this case, if under the terms of the implicit-function theorem an implicit function is
known to exist, we can still obtain the desired derivatives without having to solve for y first.
To do this, we make use of the so-called implicit-function rule, a rule that can give us
the derivatives of every implicit function defined by the given equation. The development
of this rule depends on the following basic facts: (1) if two expressions are identically
equal, their respective total differentials must be equal;‡ (2) differentiation of an expression
that involves $y, x_1, \ldots, x_m$ will yield an expression involving the differentials
$dy, dx_1, \ldots, dx_m$; and (3) the differential of y, dy, can be substituted out, so the fact that
we cannot solve for y does not matter.

† On the other hand, if $F_y = 0$ in an entire neighborhood, then it can be concluded that no implicit
function is defined in that neighborhood. By the same token, if $F_y = 0$ identically, then no implicit
function exists anywhere.

Applying these facts to the equation $F(y, x_1, \ldots, x_m) = 0$ (which, we recall, has the
status of an identity in the neighborhood N in which the implicit function is defined), we
can write dF = d0, or

$$F_y\,dy + F_1\,dx_1 + F_2\,dx_2 + \cdots + F_m\,dx_m = 0 \tag{8.22}$$

Since the implicit function $y = f(x_1, x_2, \ldots, x_m)$ has the total differential

$$dy = f_1\,dx_1 + f_2\,dx_2 + \cdots + f_m\,dx_m$$

we can substitute this dy expression into (8.22) to get (after collecting terms)

$$(F_y f_1 + F_1)\,dx_1 + (F_y f_2 + F_2)\,dx_2 + \cdots + (F_y f_m + F_m)\,dx_m = 0 \tag{8.22'}$$

The fact that all the $dx_i$ can vary independently from one another means that, for the
equation (8.22') to hold, each parenthesized expression must individually vanish; i.e., we must
have

$$F_y f_i + F_i = 0 \qquad \text{(for all } i\text{)}$$

Dividing through by $F_y$, and solving for $f_i$, we obtain the so-called implicit-function rule
for finding the partial derivative $f_i$ of the implicit function $y = f(x_1, x_2, \ldots, x_m)$:

$$f_i \equiv \frac{\partial y}{\partial x_i} = -\frac{F_i}{F_y} \qquad (i = 1, 2, \ldots, m) \tag{8.23}$$

In the simple case where the given equation is F(y, x) = 0, the rule gives

$$\frac{dy}{dx} = -\frac{F_x}{F_y} \tag{8.23'}$$


What this rule states is that, even if the specific form of the implicit function is not known
to us, we can nevertheless find its derivative(s) by taking the negative of the ratio of a pair of
partial derivatives of the F function which appears in the given equation that defines the
implicit function. Observe that $F_y$ always appears in the denominator of the ratio. This being
the case, it is not admissible to have $F_y = 0$. Since the implicit-function theorem specifies
that $F_y \neq 0$ at the point around which the implicit function is defined, the problem of a zero
denominator is automatically taken care of in the relevant neighborhood of that point.

‡ Take, for example, the identity

$$x^2 - y^2 \equiv (x + y)(x - y)$$

This is an identity because the two sides are equal for any values of x and y that one may assign.
Taking the total differential of each side, we have

$$\begin{aligned}
d(\text{left side}) &= 2x\,dx - 2y\,dy \\
d(\text{right side}) &= (x - y)\,d(x + y) + (x + y)\,d(x - y) \\
&= (x - y)(dx + dy) + (x + y)(dx - dy) \\
&= 2x\,dx - 2y\,dy
\end{aligned}$$

The two results are indeed equal. If two expressions are not identically equal, but are equal only
for certain specific values of the variables, however, their total differentials will not be equal. The
equation

$$x^2 - y^2 = x^2 + y^2 - 2$$

for instance, is valid only for $y = \pm 1$. The total differentials of the two sides are

$$\begin{aligned}
d(\text{left side}) &= 2x\,dx - 2y\,dy \\
d(\text{right side}) &= 2x\,dx + 2y\,dy
\end{aligned}$$

which are not equal. Note, in particular, that they are not equal even at $y = \pm 1$.

Example 1

Find dy/dx for the implicit function defined by (8.17'). Since F(y, x) takes the form of
$y - 3x^4$, we have, by (8.23'),

$$\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{-12x^3}{1} = 12x^3$$

In this particular case, we can easily solve the given equation for y to get $y = 3x^4$. Thus the
correctness of the derivative is easily verified.


Example 2

Find dy/dx for the implicit functions defined by the equation of the circle (8.18). This
time we have $F(y, x) = x^2 + y^2 - 9$; thus $F_y = 2y$ and $F_x = 2x$. By (8.23'), the desired
derivative is

$$\frac{dy}{dx} = -\frac{2x}{2y} = -\frac{x}{y} \qquad (y \neq 0)^{\dagger}$$

Earlier, it was asserted that the implicit-function rule gives us the derivative of every implicit
function defined by a given equation. Let us verify this with the two functions in (8.18') and
their derivatives in (8.21). If we substitute $y^+$ for y in the implicit-function-rule result
dy/dx = -x/y, we will indeed obtain the derivative $dy^+/dx$ as shown in (8.21); similarly,
the substitution of $y^-$ for y will yield the other derivative in (8.21). Thus our earlier assertion
is duly verified.
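The same verification can be sketched numerically. The helper names below are hypothetical, and the sample x values are arbitrary choices; the code simply compares the implicit-function-rule result with the explicit derivatives of the two half-circle functions.

```python
import math

# Partial derivatives of F(y, x) = x^2 + y^2 - 9, the circle equation (8.18).
def F_x(y, x):
    return 2.0 * x

def F_y(y, x):
    return 2.0 * y

def implicit_slope(y, x):
    """dy/dx = -F_x / F_y; not applicable where F_y = 0."""
    fy = F_y(y, x)
    if fy == 0.0:
        raise ZeroDivisionError("F_y = 0: the implicit-function rule does not apply here")
    return -F_x(y, x) / fy

def explicit_slope(x, branch):
    """Derivative of branch * sqrt(9 - x^2); branch is +1 (upper half) or -1 (lower half)."""
    return branch * (-x) / math.sqrt(9.0 - x * x)

if __name__ == "__main__":
    for x in (-2.0, 0.5, 1.0, 2.5):
        for branch in (+1, -1):
            y = branch * math.sqrt(9.0 - x * x)
            print(x, branch, implicit_slope(y, x), explicit_slope(x, branch))
```

One rule, fed with the y value of whichever half of the circle we are on, reproduces both derivatives; at the two points where y = 0 the function raises an error, mirroring the restriction attached to the rule.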


Example 3

Find $\partial y/\partial x$ for any implicit function(s) that may be defined by the equation
$F(y, x, w) = y^3x^2 + w^3 + yxw - 3 = 0$. This equation is not easily solved for y. But since
$F_y$, $F_x$, and $F_w$ are all obviously continuous, and since $F_y = 3y^2x^2 + xw$ is indeed nonzero
at a point such as (1, 1, 1) which satisfies the given equation, an implicit function y = f(x, w)
assuredly exists around that point at least. It is thus meaningful to talk about the derivative
$\partial y/\partial x$. By (8.23), moreover, we can immediately write

$$\frac{\partial y}{\partial x} = -\frac{F_x}{F_y} = -\frac{2y^3x + yw}{3y^2x^2 + xw}$$

At the point (1, 1, 1), this derivative has the value -3/4.
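Because this equation is awkward to solve for y, a numerical cross-check is a natural companion to the rule. The sketch below is only an illustration under stated assumptions: it uses Newton's method (our choice, not something the text prescribes) to find y for x slightly above and below 1 with w fixed at 1, and compares the resulting difference quotient with the rule's value.

```python
def F(y, x, w):
    return y**3 * x**2 + w**3 + y * x * w - 3.0

def F_y(y, x, w):
    return 3.0 * y**2 * x**2 + x * w

def F_x(y, x, w):
    return 2.0 * y**3 * x + y * w

def solve_y(x, w, y_start=1.0, tol=1e-14, max_iter=50):
    """Newton's method in y for fixed (x, w), started near the known solution y = 1."""
    y = y_start
    for _ in range(max_iter):
        step = F(y, x, w) / F_y(y, x, w)
        y -= step
        if abs(step) < tol:
            break
    return y

# Value given by the implicit-function rule at the point (1, 1, 1).
rule_value = -F_x(1.0, 1.0, 1.0) / F_y(1.0, 1.0, 1.0)

# Difference quotient of the numerically solved implicit function, with w held at 1.
h = 1e-5
numeric_value = (solve_y(1.0 + h, 1.0) - solve_y(1.0 - h, 1.0)) / (2.0 * h)

if __name__ == "__main__":
    print(rule_value)      # -0.75
    print(numeric_value)   # close to -0.75
```

The rule needs only two partial derivatives evaluated at the point, whereas the numerical route has to re-solve the equation for every nearby x.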


† The restriction $y \neq 0$ is of course perfectly consistent with our earlier discussion of the equation
(8.18) that follows the statement of the implicit-function theorem.

Example 4

Assume that the equation F(Q, K, L) = 0 implicitly defines a production function
Q = f(K, L). Let us find a way of expressing the marginal physical products $MPP_K$ and $MPP_L$ in
relation to the function F. Since the marginal products are simply the partial derivatives
$\partial Q/\partial K$ and $\partial Q/\partial L$, we can apply the implicit-function rule and write

$$MPP_K \equiv \frac{\partial Q}{\partial K} = -\frac{F_K}{F_Q} \qquad \text{and} \qquad MPP_L \equiv \frac{\partial Q}{\partial L} = -\frac{F_L}{F_Q}$$

Aside from these, we can obtain yet another partial derivative,

$$\frac{\partial K}{\partial L} = -\frac{F_L}{F_K}$$

from the equation F(Q, K, L) = 0. What is the economic meaning of $\partial K/\partial L$? The partial
sign implies that the other variable, Q, is being held constant; it follows that the changes in
K and L described by this derivative are in the nature of "compensatory" changes designed
to keep the output Q constant at a specified level. These are therefore the type of changes
pertaining to movements along a production isoquant drawn with the K variable on the
vertical axis and the L variable on the horizontal axis. As a matter of fact, the derivative $\partial K/\partial L$
is the measure of the slope of such an isoquant, which is negative in the normal case. The
absolute value of $\partial K/\partial L$, on the other hand, is the measure of the marginal rate of technical
substitution between the two inputs.
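For a concrete feel, the sketch below applies these three ratios to one specific F. The Cobb-Douglas form $F(Q, K, L) = Q - K^{0.4}L^{0.6}$, the exponents, and the evaluation point are all assumptions made for illustration; this F happens to be solvable for K, which lets us check the isoquant slope directly.

```python
ALPHA, BETA = 0.4, 0.6   # hypothetical exponents (assumptions)

# Partial derivatives of the assumed F(Q, K, L) = Q - K**ALPHA * L**BETA.
def F_Q(Q, K, L):
    return 1.0

def F_K(Q, K, L):
    return -ALPHA * K**(ALPHA - 1.0) * L**BETA

def F_L(Q, K, L):
    return -BETA * K**ALPHA * L**(BETA - 1.0)

def isoquant_K(Q, L):
    """K as an explicit function of L along the isoquant for output level Q."""
    return (Q / L**BETA) ** (1.0 / ALPHA)

K0, L0 = 4.0, 9.0
Q0 = K0**ALPHA * L0**BETA

mpp_k = -F_K(Q0, K0, L0) / F_Q(Q0, K0, L0)   # marginal physical product of capital
mpp_l = -F_L(Q0, K0, L0) / F_Q(Q0, K0, L0)   # marginal physical product of labor
slope = -F_L(Q0, K0, L0) / F_K(Q0, K0, L0)   # isoquant slope, negative here
mrts = abs(slope)                            # marginal rate of technical substitution

# Finite-difference slope of the explicit isoquant, as a cross-check.
h = 1e-6
numeric_slope = (isoquant_K(Q0, L0 + h) - isoquant_K(Q0, L0 - h)) / (2.0 * h)

if __name__ == "__main__":
    print(mpp_k, mpp_l, slope, mrts, numeric_slope)
```

For this assumed form the slope reduces to $-(\beta/\alpha)(K/L)$, so the marginal rate of technical substitution equals the ratio $MPP_L/MPP_K$.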


Extension to the Simultaneous-Equation Case 

The implicit-function theorem also comes in a more general and powerful version that
deals with the conditions under which a set of simultaneous equations

$$\begin{aligned}
F^1(y_1, \ldots, y_n; x_1, \ldots, x_m) &= 0 \\
F^2(y_1, \ldots, y_n; x_1, \ldots, x_m) &= 0 \\
&\ \,\vdots \\
F^n(y_1, \ldots, y_n; x_1, \ldots, x_m) &= 0
\end{aligned} \tag{8.24}$$

will assuredly define a set of implicit functions†

$$\begin{aligned}
y_1 &= f^1(x_1, \ldots, x_m) \\
y_2 &= f^2(x_1, \ldots, x_m) \\
&\ \,\vdots \\
y_n &= f^n(x_1, \ldots, x_m)
\end{aligned} \tag{8.25}$$

The generalized version of the theorem states that:

Given the equation system (8.24), if (a) the functions $F^1, \ldots, F^n$ all have continuous partial
derivatives with respect to all the y and x variables, and if (b) at a point
$(y_{10}, \ldots, y_{n0}; x_{10}, \ldots, x_{m0})$ satisfying (8.24), the following Jacobian determinant is nonzero:

$$|J| \equiv \left|\frac{\partial(F^1, \ldots, F^n)}{\partial(y_1, \ldots, y_n)}\right| \equiv
\begin{vmatrix}
\dfrac{\partial F^1}{\partial y_1} & \dfrac{\partial F^1}{\partial y_2} & \cdots & \dfrac{\partial F^1}{\partial y_n} \\
\dfrac{\partial F^2}{\partial y_1} & \dfrac{\partial F^2}{\partial y_2} & \cdots & \dfrac{\partial F^2}{\partial y_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial F^n}{\partial y_1} & \dfrac{\partial F^n}{\partial y_2} & \cdots & \dfrac{\partial F^n}{\partial y_n}
\end{vmatrix}$$

then there exists an m-dimensional neighborhood of $(x_{10}, \ldots, x_{m0})$, N, in which the variables
$y_1, \ldots, y_n$ are functions of the variables $x_1, \ldots, x_m$ in the form of (8.25). These implicit
functions satisfy

$$\begin{aligned}
y_{10} &= f^1(x_{10}, \ldots, x_{m0}) \\
&\ \,\vdots \\
y_{n0} &= f^n(x_{10}, \ldots, x_{m0})
\end{aligned}$$

They also satisfy (8.24) for every m-tuple $(x_1, \ldots, x_m)$ in the neighborhood N, thereby giving
(8.24) the status of a set of identities as far as this neighborhood is concerned. Moreover,
the implicit functions $f^1, \ldots, f^n$ are continuous and have continuous partial derivatives with
respect to all the x variables.

† To view it another way, what these conditions serve to do is to assure us that the n equations in
(8.24) can in principle be solved for the n variables, $y_1, \ldots, y_n$, even if we may not be able to obtain
the solution (8.25) in an explicit form.

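As in the single-equation case, it is possible to find the partial derivatives of the implicit
functions directly from the n equations in (8.24), without having to solve them for the y
variables. Taking advantage of the fact that, in the neighborhood N, the equations in (8.24)
have the status of identities, we can take the total differential of each of these, and write
$dF^j = 0$ (j = 1, 2, ..., n). The result is a set of equations involving the differentials
$dy_1, \ldots, dy_n$ and $dx_1, \ldots, dx_m$. Specifically, after transposing the $dx_i$ terms to the right of
the equals signs, we have

$$\begin{aligned}
\frac{\partial F^1}{\partial y_1}dy_1 + \frac{\partial F^1}{\partial y_2}dy_2 + \cdots + \frac{\partial F^1}{\partial y_n}dy_n &= -\left(\frac{\partial F^1}{\partial x_1}dx_1 + \frac{\partial F^1}{\partial x_2}dx_2 + \cdots + \frac{\partial F^1}{\partial x_m}dx_m\right) \\
\frac{\partial F^2}{\partial y_1}dy_1 + \frac{\partial F^2}{\partial y_2}dy_2 + \cdots + \frac{\partial F^2}{\partial y_n}dy_n &= -\left(\frac{\partial F^2}{\partial x_1}dx_1 + \frac{\partial F^2}{\partial x_2}dx_2 + \cdots + \frac{\partial F^2}{\partial x_m}dx_m\right) \\
&\ \,\vdots \\
\frac{\partial F^n}{\partial y_1}dy_1 + \frac{\partial F^n}{\partial y_2}dy_2 + \cdots + \frac{\partial F^n}{\partial y_n}dy_n &= -\left(\frac{\partial F^n}{\partial x_1}dx_1 + \frac{\partial F^n}{\partial x_2}dx_2 + \cdots + \frac{\partial F^n}{\partial x_m}dx_m\right)
\end{aligned} \tag{8.26}$$

Moreover, from (8.25), we can write the differentials of the $y_j$ variables as

$$\begin{aligned}
dy_1 &= \frac{\partial y_1}{\partial x_1}dx_1 + \frac{\partial y_1}{\partial x_2}dx_2 + \cdots + \frac{\partial y_1}{\partial x_m}dx_m \\
dy_2 &= \frac{\partial y_2}{\partial x_1}dx_1 + \frac{\partial y_2}{\partial x_2}dx_2 + \cdots + \frac{\partial y_2}{\partial x_m}dx_m \\
&\ \,\vdots \\
dy_n &= \frac{\partial y_n}{\partial x_1}dx_1 + \frac{\partial y_n}{\partial x_2}dx_2 + \cdots + \frac{\partial y_n}{\partial x_m}dx_m
\end{aligned} \tag{8.27}$$

and these can be used to eliminate the $dy_j$ expressions in (8.26). But since the result of
substitution would be unmanageably messy, let us simplify matters by considering only what
would happen when $x_1$ alone changes while all the other variables $x_2, \ldots, x_m$ remain
constant. Letting $dx_1 \neq 0$, but setting $dx_2 = \cdots = dx_m = 0$ in (8.26) and (8.27), then
substituting (8.27) into (8.26) and dividing through by $dx_1 \neq 0$, we obtain the equation system

$$\begin{aligned}
\frac{\partial F^1}{\partial y_1}\left(\frac{\partial y_1}{\partial x_1}\right) + \frac{\partial F^1}{\partial y_2}\left(\frac{\partial y_2}{\partial x_1}\right) + \cdots + \frac{\partial F^1}{\partial y_n}\left(\frac{\partial y_n}{\partial x_1}\right) &= -\frac{\partial F^1}{\partial x_1} \\
\frac{\partial F^2}{\partial y_1}\left(\frac{\partial y_1}{\partial x_1}\right) + \frac{\partial F^2}{\partial y_2}\left(\frac{\partial y_2}{\partial x_1}\right) + \cdots + \frac{\partial F^2}{\partial y_n}\left(\frac{\partial y_n}{\partial x_1}\right) &= -\frac{\partial F^2}{\partial x_1} \\
&\ \,\vdots \\
\frac{\partial F^n}{\partial y_1}\left(\frac{\partial y_1}{\partial x_1}\right) + \frac{\partial F^n}{\partial y_2}\left(\frac{\partial y_2}{\partial x_1}\right) + \cdots + \frac{\partial F^n}{\partial y_n}\left(\frac{\partial y_n}{\partial x_1}\right) &= -\frac{\partial F^n}{\partial x_1}
\end{aligned} \tag{8.28}$$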
Even this result, for the case where $x_1$ alone changes, looks formidably complex,
because it is full of derivatives. But its structure is actually quite easy to comprehend, once
we learn to distinguish between the two types of derivatives that appear in (8.28). One type,
which we have parenthesized for visual distinction, consists of the partial derivatives of the
implicit functions with respect to $x_1$ that we are seeking. These, therefore, should be viewed
as the “variables” to be solved for in (8.28). The other type, on the other hand, consists of the
partial derivatives of the $F^j$ functions given in (8.24). Since they would all take specific values
when evaluated at the point $(y_{10}, \ldots, y_{n0}; x_{10}, \ldots, x_{m0})$, the point around which the
implicit functions are defined, they appear here not as derivative functions but as derivative
values. As such, they can be treated as given constants. This fact makes (8.28) a linear equation
system, with a structure similar to (4.1). What is interesting is that such a linear system
has arisen during the process of analysis of a problem that is not necessarily linear in itself,
since no linearity restrictions have been placed on the equation system (8.24). Thus we have
here an illustration of how linear algebra can come into play even in nonlinear problems.

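Being a linear equation system, (8.28) can be written in matrix notation as

$$\begin{bmatrix}
\dfrac{\partial F^1}{\partial y_1} & \dfrac{\partial F^1}{\partial y_2} & \cdots & \dfrac{\partial F^1}{\partial y_n} \\
\dfrac{\partial F^2}{\partial y_1} & \dfrac{\partial F^2}{\partial y_2} & \cdots & \dfrac{\partial F^2}{\partial y_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial F^n}{\partial y_1} & \dfrac{\partial F^n}{\partial y_2} & \cdots & \dfrac{\partial F^n}{\partial y_n}
\end{bmatrix}
\begin{bmatrix}
\left(\dfrac{\partial y_1}{\partial x_1}\right) \\
\left(\dfrac{\partial y_2}{\partial x_1}\right) \\
\vdots \\
\left(\dfrac{\partial y_n}{\partial x_1}\right)
\end{bmatrix}
=
\begin{bmatrix}
-\dfrac{\partial F^1}{\partial x_1} \\
-\dfrac{\partial F^2}{\partial x_1} \\
\vdots \\
-\dfrac{\partial F^n}{\partial x_1}
\end{bmatrix} \tag{8.28'}$$

Since the determinant of the coefficient matrix in (8.28') is nothing but the particular
Jacobian determinant |J| which is known to be nonzero under conditions of the
implicit-function theorem, and since the system must be nonhomogeneous (why?), there should be
a unique nontrivial solution to (8.28'). By Cramer's rule, this solution may be expressed
analytically as follows:

$$\left(\frac{\partial y_j}{\partial x_1}\right) = \frac{|J_j|}{|J|} \qquad (j = 1, 2, \ldots, n) \qquad \text{[see (5.18)]} \tag{8.29}$$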


By a suitable adaptation of this procedure, the partial derivatives of the implicit functions
with respect to the other variables, $x_2, \ldots, x_m$, can also be obtained. It is a nice feature of
this procedure that, each time we allow a particular $x_i$ variable to change, we can obtain in
one fell swoop the partial derivatives of all the implicit functions $f^1, \ldots, f^n$ with respect
to that particular $x_i$ variable.

Similarly to the implicit-function rule (8.23) for the single-equation case, the procedure
just described calls only for the use of the partial derivatives of the F functions, evaluated
at the point $(y_{10}, \ldots, y_{n0}; x_{10}, \ldots, x_{m0})$, in the calculation of the partial derivatives of the
implicit functions $f^1, \ldots, f^n$. Thus the matrix equation (8.28') and its analytical solution
(8.29) are in effect a statement of the simultaneous-equation version of the
implicit-function rule.

Note that the requirement $|J| \neq 0$ rules out a zero denominator in (8.29), just as the
requirement $F_y \neq 0$ did in the implicit-function rule (8.23) and (8.23'). Also, the role
played by the condition $|J| \neq 0$ in guaranteeing a unique (albeit implicit) solution (8.25) to
the general (possibly nonlinear) system (8.24) is very similar to the role of the
nonsingularity condition $|A| \neq 0$ in a linear system Ax = d.


Example 5 


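The following three equations

$$\begin{aligned}
xy - w &= 0 & \qquad F^1(x, y, w; z) &= 0 \\
y - w^3 - 3z &= 0 & \qquad F^2(x, y, w; z) &= 0 \\
w^3 + z^3 - 2zw &= 0 & \qquad F^3(x, y, w; z) &= 0
\end{aligned}$$

are satisfied at point P: (x, y, w, z) = (1/4, 4, 1, 1). The $F^j$ functions obviously possess
continuous derivatives. Thus, if the Jacobian |J| is nonzero at point P, we can use the
implicit-function theorem to find the comparative-static derivative $(\partial x/\partial z)$.

To do this, we can first take the total differential of the system:

$$\begin{aligned}
y\,dx + x\,dy - dw &= 0 \\
dy - 3w^2\,dw - 3\,dz &= 0 \\
(3w^2 - 2z)\,dw + (3z^2 - 2w)\,dz &= 0
\end{aligned}$$

Moving the exogenous differential (and its coefficients) to the right-hand side and writing
in matrix form, we get

$$\begin{bmatrix} y & x & -1 \\ 0 & 1 & -3w^2 \\ 0 & 0 & (3w^2 - 2z) \end{bmatrix}
\begin{bmatrix} dx \\ dy \\ dw \end{bmatrix}
= \begin{bmatrix} 0 \\ 3 \\ 2w - 3z^2 \end{bmatrix} dz$$

where the coefficient matrix on the left-hand side is the Jacobian

$$|J| = \begin{vmatrix} F^1_x & F^1_y & F^1_w \\ F^2_x & F^2_y & F^2_w \\ F^3_x & F^3_y & F^3_w \end{vmatrix}
= \begin{vmatrix} y & x & -1 \\ 0 & 1 & -3w^2 \\ 0 & 0 & (3w^2 - 2z) \end{vmatrix}
= y(3w^2 - 2z)$$

At the point P, the Jacobian determinant |J| = 4 ($\neq 0$). Therefore, the implicit-function rule
applies and

$$\begin{bmatrix} y & x & -1 \\ 0 & 1 & -3w^2 \\ 0 & 0 & (3w^2 - 2z) \end{bmatrix}
\begin{bmatrix} \left(\dfrac{\partial x}{\partial z}\right) \\ \left(\dfrac{\partial y}{\partial z}\right) \\ \left(\dfrac{\partial w}{\partial z}\right) \end{bmatrix}
= \begin{bmatrix} 0 \\ 3 \\ 2w - 3z^2 \end{bmatrix}$$

Using Cramer's rule to find an expression for $(\partial x/\partial z)$, we obtain

$$\begin{aligned}
\left(\frac{\partial x}{\partial z}\right)
&= \frac{\begin{vmatrix} 0 & x & -1 \\ 3 & 1 & -3w^2 \\ 2w - 3z^2 & 0 & (3w^2 - 2z) \end{vmatrix}}{|J|}
= \frac{\begin{vmatrix} 0 & \frac{1}{4} & -1 \\ 3 & 1 & -3 \\ -1 & 0 & 1 \end{vmatrix}}{4} \\
&= \frac{1}{4}\left[\,0 + (-3)\begin{vmatrix} \frac{1}{4} & -1 \\ 0 & 1 \end{vmatrix} + (-1)\begin{vmatrix} \frac{1}{4} & -1 \\ 1 & -3 \end{vmatrix}\,\right]
= \frac{-3}{16} + \frac{-1}{16} = -\frac{1}{4}
\end{aligned}$$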

Example 6

Let the national-income model (7.17) be rewritten in the form

$$\begin{aligned}
Y - C - I_0 - G_0 &= 0 \\
C - \alpha - \beta(Y - T) &= 0 \\
T - \gamma - \delta Y &= 0
\end{aligned} \tag{8.30}$$

If we take the endogenous variables (Y, C, T) to be $(y_1, y_2, y_3)$, and take the exogenous
variables and parameters $(I_0, G_0, \alpha, \beta, \gamma, \delta)$ to be $(x_1, x_2, \ldots, x_6)$, then the left-side
expression in each equation can be regarded as a specific F function, in the form of
$F^j(Y, C, T; I_0, G_0, \alpha, \beta, \gamma, \delta)$. Thus (8.30) is a specific case of (8.24), with n = 3 and m = 6.
Since the functions $F^1$, $F^2$, and $F^3$ do have continuous partial derivatives, and since the
relevant Jacobian determinant (the one involving only the endogenous variables),

$$|J| = \begin{vmatrix}
\dfrac{\partial F^1}{\partial Y} & \dfrac{\partial F^1}{\partial C} & \dfrac{\partial F^1}{\partial T} \\
\dfrac{\partial F^2}{\partial Y} & \dfrac{\partial F^2}{\partial C} & \dfrac{\partial F^2}{\partial T} \\
\dfrac{\partial F^3}{\partial Y} & \dfrac{\partial F^3}{\partial C} & \dfrac{\partial F^3}{\partial T}
\end{vmatrix}
= \begin{vmatrix} 1 & -1 & 0 \\ -\beta & 1 & \beta \\ -\delta & 0 & 1 \end{vmatrix}
= 1 - \beta + \beta\delta \tag{8.31}$$

is always nonzero (both $\beta$ and $\delta$ being restricted to be positive fractions), we can take Y, C,
and T to be implicit functions of $(I_0, G_0, \alpha, \beta, \gamma, \delta)$ at and around any point that satisfies
(8.30). But a point that satisfies (8.30) would be an equilibrium solution, relating to $Y^*$, $C^*$,
and $T^*$. Hence, what the implicit-function theorem tells us is that we are justified in writing

$$\begin{aligned}
Y^* &= f^1(I_0, G_0, \alpha, \beta, \gamma, \delta) \\
C^* &= f^2(I_0, G_0, \alpha, \beta, \gamma, \delta) \\
T^* &= f^3(I_0, G_0, \alpha, \beta, \gamma, \delta)
\end{aligned}$$

indicating that the equilibrium values of the endogenous variables are implicit functions of
the exogenous variables and the parameters.

The partial derivatives of the implicit functions, such as $\partial Y^*/\partial I_0$ and $\partial Y^*/\partial G_0$, are in the
nature of comparative-static derivatives. To find these, we need only the partial derivatives
of the F functions, evaluated at the equilibrium state of the model. Moreover, since n = 3,
three of these can be found in one operation. Suppose we now hold all exogenous variables
and parameters fixed except $G_0$. Then, by adapting the result in (8.28'), we may write the
equation

$$\begin{bmatrix} 1 & -1 & 0 \\ -\beta & 1 & \beta \\ -\delta & 0 & 1 \end{bmatrix}
\begin{bmatrix} \partial Y^*/\partial G_0 \\ \partial C^*/\partial G_0 \\ \partial T^*/\partial G_0 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

from which three comparative-static derivatives (all with respect to $G_0$) can be calculated.
The first one, representing the government-expenditure multiplier, will for instance come
out to be

$$\frac{\partial Y^*}{\partial G_0} = \frac{\begin{vmatrix} 1 & -1 & 0 \\ 0 & 1 & \beta \\ 0 & 0 & 1 \end{vmatrix}}{|J|} = \frac{1}{1 - \beta + \beta\delta} \qquad \text{[by (8.31)]}$$

This is, of course, nothing but the result obtained earlier in (7.19). Note, however, that in
the present approach we have worked only with implicit functions, and have completely
bypassed the step of solving the system (8.30) explicitly for $Y^*$, $C^*$, and $T^*$. It is this
particular feature of the method that will now enable us to tackle the comparative statics of
general-function models which, by their very nature, can yield no explicit solution.
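As a quick numerical companion to this example, the sketch below solves the three-equation system for the derivatives with respect to government expenditure by Cramer's rule. The parameter values (a marginal propensity to consume of 4/5 and an income tax rate of 1/4) and the function names are assumptions chosen only for illustration.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def government_expenditure_effects(beta, delta):
    """Return |J| and the derivatives of (Y*, C*, T*) with respect to G0."""
    jacobian = [
        [1, -1, 0],
        [-beta, 1, beta],
        [-delta, 0, 1],
    ]
    rhs = [1, 0, 0]   # negatives of the partial derivatives of the F functions w.r.t. G0
    det_J = det3(jacobian)
    effects = []
    for j in range(3):
        replaced = [[rhs[r] if c == j else jacobian[r][c] for c in range(3)]
                    for r in range(3)]
        effects.append(det3(replaced) / det_J)
    return det_J, effects

# Hypothetical parameter values: both are positive fractions, as the model requires.
beta, delta = Fraction(4, 5), Fraction(1, 4)
det_J, (dY_dG, dC_dG, dT_dG) = government_expenditure_effects(beta, delta)

if __name__ == "__main__":
    print(det_J)   # 2/5, which equals 1 - beta + beta*delta
    print(dY_dG)   # 5/2, the government-expenditure multiplier
```

With these assumed values the multiplier comes out to 5/2, the same number that the closed-form expression $1/(1 - \beta + \beta\delta)$ gives, and the other two derivatives are obtained in the same pass.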


EXERCISE 8.5

1. For each F(x, y) = 0, find dy/dx for each of the following:

(a) $y - 6x + 7 = 0$

(b) $3y + 12x + 17 = 0$

(c) $x^2 + 6x - 13 - y = 0$

2. For each F(x, y) = 0 use the implicit-function rule to find dy/dx:

(a) $F(x, y) = 3x^2 + 2xy + 4y^3 = 0$

(b) $F(x, y) = 12x^5 - 2y = 0$

(c) $F(x, y) = 7x^2 + 2xy^2 + 9y^4 = 0$

(d) $F(x, y) = 6x^3 - 3y = 0$

3. For each F(x, y, z) = 0 use the implicit-function rule to find $\partial y/\partial x$ and $\partial y/\partial z$:

(a) $F(x, y, z) = x^2y^3 + z^2 + xyz = 0$

(b) $F(x, y, z) = x^3z^2 + y^3 + 4xyz = 0$

(c) $F(x, y, z) = 3x^2y^3 + xz^2y^2 + y^3zx^4 + y^2z = 0$

4. Assuming that the equation $F(U, x_1, x_2, \ldots, x_n) = 0$ implicitly defines a utility function
$U = f(x_1, x_2, \ldots, x_n)$:

(a) Find the expressions for $\partial U/\partial x_2$, $\partial U/\partial x_n$, $\partial x_3/\partial x_2$, and $\partial x_4/\partial x_n$.

(b) Interpret their respective economic meanings.

5. For each of the given equations F(y, x) = 0, is an implicit function y = f(x) defined
around the point (y = 3, x = 1)?

(a) $x^2 - 2x^2y + 3xy^2 - 22 = 0$

(b) $2x^2 + 4xy - y^4 + 67 = 0$

If your answer is affirmative, find dy/dx by the implicit-function rule, and evaluate it
at the said point.

6. Given $x^2 + 3xy - 2yz + y^2 - z^2 - 11 = 0$, is an implicit function z = f(x, y) defined
around the point (x = 1, y = 2, z = 0)? If so, find $\partial z/\partial x$ and $\partial z/\partial y$ by the
implicit-function rule, and evaluate them at that point.

7. By considering the equation $F(y, x) = (x - y)^3 = 0$ in a neighborhood around the
point of origin, prove that the conditions cited in the implicit-function theorem are
not in the nature of necessary conditions.

8. If the equation F(x, y, z) = 0 implicitly defines each of the three variables as a
function of the other two variables, and if all the derivatives in question exist, find the
value of $\dfrac{\partial z}{\partial x}\,\dfrac{\partial x}{\partial y}\,\dfrac{\partial y}{\partial z}$.

9. Justify the assertion in the text that the equation system (8.28') must be
nonhomogeneous.

10. From the national-income model (8.30), find the nonincome-tax multiplier by the
implicit-function rule. Check your results against (7.20).


Comparative Statics of General-Function Models 

When we first considered the problem of comparative-static analysis in Chap. 7, we dealt 
with the case where the equilibrium values of the endogenous variables of the model are ex¬ 
pressible explicitly in terms of the exogenous variables and parameters. There, the tech¬ 
nique of simple partial differentiation was all we needed. When a model contains functions 
expressed in the general form, however, that technique becomes inapplicable because of the 
unavailability of explicit solutions. Instead, a new technique must be employed that makes 
use of such concepts as total differentials, total derivatives, as well as the implicit-function 
theorem and the implicit-function rule. We shall illustrate this first with a market model,
and then move on to national-income models. 

Market Model 

Consider a single-commodity market, where the quantity demanded $Q_d$ is a function not
only of price $P$ but also of an exogenously determined income $Y_0$. The quantity supplied
$Q_s$, on the other hand, is a function of price alone. If those functions are not given in
specific forms, our model may be written generally as follows:

$$
\begin{aligned}
Q_d &= Q_s \\
Q_d &= D(P, Y_0) \qquad (\partial D/\partial P < 0;\ \partial D/\partial Y_0 > 0) \\
Q_s &= S(P) \qquad (dS/dP > 0)
\end{aligned}
\tag{8.32}
$$

Both the $D$ and $S$ functions are assumed to possess continuous derivatives or, in other
words, to have smooth graphs. Moreover, in order to ensure economic relevance, we
have imposed definite restrictions on the signs of these derivatives. By the restriction
$dS/dP > 0$, the supply function is stipulated to be strictly increasing, although it is
permitted to be either linear or nonlinear. Similarly, by the restrictions on the two partial
derivatives of the demand function, we indicate that it is a strictly decreasing function of
price but a strictly increasing function of income. For notational simplicity, the sign
restrictions on the derivatives of a function are sometimes indicated with $+$ or $-$ signs
placed directly underneath the independent variables. Thus the $D$ and $S$ functions in (8.32)
may alternatively be presented as

$$
Q_d = D(\underset{-}{P},\ \underset{+}{Y_0}) \qquad Q_s = S(\underset{+}{P})
$$

These restrictions serve to confine our analysis to the “normal” case we expect to
encounter.

In drawing the usual type of two-dimensional demand curve, the income level is
assumed to be held fixed. When income changes, it will upset a given equilibrium by
causing a shift of the demand curve. Similarly, in (8.32), $Y_0$ can cause a disequilibrating
change through the demand function. Here, $Y_0$ is the only exogenous variable or parameter;
thus the comparative-static analysis of this model will be concerned exclusively with how a
change in $Y_0$ will affect the equilibrium position of the model.

The equilibrium position of the market is defined by the equilibrium condition
$Q_d = Q_s$, which, upon substitution and rearrangement, can be expressed by

$$
D(P, Y_0) - S(P) = 0 \tag{8.33}
$$

Even though this equation cannot be solved explicitly for the equilibrium price $P^*$, we shall
assume that there does exist a static equilibrium, for otherwise there would be no point in
even raising the question of comparative statics. From our experience with specific-function
models, we have learned to expect $P^*$ to be a function of the exogenous variable $Y_0$:

$$
P^* = P^*(Y_0) \tag{8.34}
$$

But now we can provide a rigorous foundation for this expectation by appealing to the
implicit-function theorem. Inasmuch as (8.33) is in the form of $F(P, Y_0) = 0$, the
satisfaction of the conditions of the implicit-function theorem will guarantee that every
value of $Y_0$ will yield a unique value of $P^*$ in the neighborhood of a point satisfying
(8.33), that is, in the neighborhood of an (initial or “old”) equilibrium solution. In that
case, we can indeed write the implicit function $P^* = P^*(Y_0)$ and discuss its derivative,
$dP^*/dY_0$ (the very comparative-static derivative we desire), which is known to exist. Let
us, therefore, check those conditions. First, the function $F(P, Y_0)$ indeed possesses
continuous derivatives; this is because, by assumption, its two additive components
$D(P, Y_0)$ and $S(P)$ have continuous derivatives. Second, the partial derivative of $F$ with
respect to $P$, namely, $F_P = \partial D/\partial P - dS/dP$, is negative, and hence nonzero,
no matter where it is evaluated. Thus the implicit-function theorem applies, and (8.34) is
indeed legitimate.

According to the same theorem, the equilibrium condition (8.33) can now be taken to be
an identity in some neighborhood of the equilibrium solution. Consequently, we may write
the equilibrium identity

$$
D(P^*, Y_0) - S(P^*) \equiv 0 \qquad [\text{Excess demand} \equiv 0 \text{ in equilibrium}] \tag{8.35}
$$

It then requires only a straight application of the implicit-function rule to produce the
comparative-static derivative, $dP^*/dY_0$. For visual clarity, we shall from here on enclose
comparative-static derivatives in parentheses to distinguish them from the regular
derivative expressions that merely constitute part of the model specification. The result
from the implicit-function rule is

$$
\left(\frac{dP^*}{dY_0}\right) = -\frac{\partial F/\partial Y_0}{\partial F/\partial P^*}
= -\frac{\partial D/\partial Y_0}{\partial D/\partial P^* - dS/dP^*} > 0 \tag{8.36}
$$

In this result, the expression $\partial D/\partial P^*$ refers to the derivative
$\partial D/\partial P$ evaluated at the initial equilibrium, i.e., at $P = P^*$; a similar
interpretation attaches to $dS/dP^*$. In fact, $\partial D/\partial Y_0$ must be evaluated at
the equilibrium point as well. By virtue of the sign specifications in (8.32), $(dP^*/dY_0)$
is invariably positive. Thus our qualitative conclusion is that an increase (decrease) in the
income level will always result in an increase (decrease) in the equilibrium price. If the
values which the derivatives of the demand and supply functions take at the initial
equilibrium are known, (8.36) will, of course, yield a quantitative conclusion also.

This discussion of market adjustment is concerned with the effect of a change in $Y_0$ on
$P^*$. Is it possible also to find out the effect on the equilibrium quantity
$Q^*$ $(= Q_d^* = Q_s^*)$? The answer is yes. Since, in the equilibrium state, we have
$Q^* = S(P^*)$, and since $P^* = P^*(Y_0)$, we may apply the chain rule to get the derivative

$$
\left(\frac{dQ^*}{dY_0}\right) = \frac{dS}{dP^*}\left(\frac{dP^*}{dY_0}\right) > 0
\qquad \left[\text{since } \frac{dS}{dP^*} > 0\right] \tag{8.37}
$$

Thus the equilibrium quantity is also positively related to $Y_0$ in this model. Again, (8.37)
can supply a quantitative conclusion if the values which the various derivatives take at the
equilibrium are known.
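
To see (8.36) and (8.37) at work, the following is a minimal numerical sketch. The functional forms of `demand` and `supply`, the income level `y0 = 4.0`, and the bisection bracket are all hypothetical choices made only so that the sign restrictions of (8.32) hold; none of them comes from the text. The sketch evaluates the two formulas at the equilibrium and compares the first with the slope obtained by simply re-solving the model at nearby income levels.

```python
import math

# Hypothetical functional forms (not from the text), chosen so that
# dD/dP < 0, dD/dY0 > 0 and dS/dP > 0, as required by (8.32).
def demand(p, y0):
    return 20.0 * math.sqrt(y0) - 3.0 * p - 0.1 * p ** 3

def supply(p):
    return 2.0 * p + math.log(1.0 + p)

def equilibrium_price(y0, lo=0.0, hi=100.0):
    """Solve D(P, Y0) - S(P) = 0 by bisection (excess demand falls as P rises)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if demand(mid, y0) - supply(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

y0 = 4.0
p_star = equilibrium_price(y0)

# Derivatives of the model functions, evaluated at the initial equilibrium.
d_p = -3.0 - 0.3 * p_star ** 2      # dD/dP*
d_y = 10.0 / math.sqrt(y0)          # dD/dY0
s_p = 2.0 + 1.0 / (1.0 + p_star)    # dS/dP*

dp_dy = -d_y / (d_p - s_p)          # implicit-function rule, (8.36)
dq_dy = s_p * dp_dy                 # chain rule, (8.37)

# Independent check: re-solve the model at slightly different incomes.
h = 1e-5
dp_dy_numeric = (equilibrium_price(y0 + h) - equilibrium_price(y0 - h)) / (2.0 * h)

print(f"P* = {p_star:.6f}")
print(f"dP*/dY0 by (8.36): {dp_dy:.6f}; by re-solving: {dp_dy_numeric:.6f}")
print(f"dQ*/dY0 by (8.37): {dq_dy:.6f}")
```

The agreement between the two slopes is what the implicit-function theorem promises: the formula needs only derivative values at the old equilibrium, not an explicit solution for $P^*$.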

The results in (8.36) and (8.37), which exhaust the comparative-static contents of the
model (since the latter contains only one exogenous and two endogenous variables), are not
surprising. In fact, they convey no more than the proposition that an upward shift of the
demand curve will result in a higher equilibrium price as well as a higher equilibrium
quantity. This same proposition, it may seem, could have been arrived at in a flash from a
simple graphic analysis! This sounds correct, but one should not lose sight of the far, far
more general character of the analytical procedure we have used here. The graphic analysis
is by its very nature limited to a specific set of curves (the geometric counterpart of a
specific set of functions); its conclusions are therefore, strictly speaking, relevant and
applicable to only that set of curves. In sharp contrast, the formulation in (8.32),
simplified as it is, covers the entire set of possible combinations of negatively sloped
demand curves and positively sloped supply curves. Thus it is vastly more general. Also, the
analytical procedure used here can handle many problems of greater complexity that would
prove to be beyond the capabilities of the graphic approach.


Simultaneous-Equation Approach 

The analysis of model (8.32) was carried out on the basis of a single equation, namely,
(8.35). Since only one endogenous variable can fruitfully be incorporated into one
equation, the inclusion of $P^*$ means the exclusion of $Q^*$. As a result, we were compelled
to find $(dP^*/dY_0)$ first and then to infer $(dQ^*/dY_0)$ in a subsequent step. Now we shall
show how $P^*$ and $Q^*$ can be studied simultaneously. As there are two endogenous variables,
we shall accordingly set up a two-equation system. First, letting $Q = Q_d = Q_s$ in (8.32)
and rearranging, we can express our market model as

$$
\begin{aligned}
F^1(P, Q; Y_0) &= D(P, Y_0) - Q = 0 \\
F^2(P, Q; Y_0) &= S(P) - Q = 0
\end{aligned}
\tag{8.38}
$$

which is in the form of (8.24), with $n = 2$ and $m = 1$. It becomes of interest, once again, to
check the conditions of the implicit-function theorem. First, since the demand and supply
functions are both assumed to possess continuous derivatives, so must the functions $F^1$ and
$F^2$. Second, the endogenous-variable Jacobian (the one involving $P$ and $Q$) indeed turns
out to be nonzero, regardless of where it is evaluated, because

$$
|J| =
\begin{vmatrix}
\dfrac{\partial F^1}{\partial P} & \dfrac{\partial F^1}{\partial Q} \\[2ex]
\dfrac{\partial F^2}{\partial P} & \dfrac{\partial F^2}{\partial Q}
\end{vmatrix}
=
\begin{vmatrix}
\dfrac{\partial D}{\partial P} & -1 \\[2ex]
\dfrac{dS}{dP} & -1
\end{vmatrix}
= \frac{dS}{dP} - \frac{\partial D}{\partial P} > 0 \tag{8.39}
$$


Hence, if an equilibrium solution $(P^*, Q^*)$ exists (as we must assume in order to make it
meaningful to talk about comparative statics), the implicit-function theorem tells us that we
can write the implicit functions

$$
P^* = P^*(Y_0) \qquad \text{and} \qquad Q^* = Q^*(Y_0) \tag{8.40}
$$

even though we cannot solve for $P^*$ and $Q^*$ explicitly. These functions are known to have
continuous derivatives. Moreover, (8.38) will have the status of a pair of identities in some
neighborhood of the equilibrium state, so that we may also write

$$
\begin{aligned}
D(P^*, Y_0) - Q^* &\equiv 0 \qquad [\text{i.e., } F^1(P^*, Q^*; Y_0) \equiv 0] \\
S(P^*) - Q^* &\equiv 0 \qquad [\text{i.e., } F^2(P^*, Q^*; Y_0) \equiv 0]
\end{aligned}
\tag{8.41}
$$

From these, $(dP^*/dY_0)$ and $(dQ^*/dY_0)$ can be found simultaneously by using the
implicit-function rule (8.28').

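In the present context, with $F^1$ and $F^2$ as defined in (8.41), and with two endogenous
variables $P^*$ and $Q^*$ and a single exogenous variable $Y_0$, the implicit-function rule
takes the specific form

$$
\begin{bmatrix}
\dfrac{\partial F^1}{\partial P^*} & \dfrac{\partial F^1}{\partial Q^*} \\[2ex]
\dfrac{\partial F^2}{\partial P^*} & \dfrac{\partial F^2}{\partial Q^*}
\end{bmatrix}
\begin{bmatrix}
\left(\dfrac{dP^*}{dY_0}\right) \\[2ex]
\left(\dfrac{dQ^*}{dY_0}\right)
\end{bmatrix}
=
\begin{bmatrix}
-\dfrac{\partial F^1}{\partial Y_0} \\[2ex]
-\dfrac{\partial F^2}{\partial Y_0}
\end{bmatrix}
$$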


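Note that the comparative-static derivatives are written here with the symbol $d$ rather than
$\partial$, because there is only one exogenous variable in the present problem. More
specifically, the last equation can be expressed as

$$
\begin{bmatrix}
\dfrac{\partial D}{\partial P^*} & -1 \\[2ex]
\dfrac{dS}{dP^*} & -1
\end{bmatrix}
\begin{bmatrix}
\left(\dfrac{dP^*}{dY_0}\right) \\[2ex]
\left(\dfrac{dQ^*}{dY_0}\right)
\end{bmatrix}
=
\begin{bmatrix}
-\dfrac{\partial D}{\partial Y_0} \\[2ex]
0
\end{bmatrix}
$$

By Cramer's rule, and using (8.39), we then find the solution to be

$$
\begin{aligned}
\left(\frac{dP^*}{dY_0}\right) &=
\frac{\begin{vmatrix} -\dfrac{\partial D}{\partial Y_0} & -1 \\ 0 & -1 \end{vmatrix}}{|J|}
= \frac{\partial D/\partial Y_0}{|J|} \\[3ex]
\left(\frac{dQ^*}{dY_0}\right) &=
\frac{\begin{vmatrix} \dfrac{\partial D}{\partial P^*} & -\dfrac{\partial D}{\partial Y_0} \\ \dfrac{dS}{dP^*} & 0 \end{vmatrix}}{|J|}
= \frac{\dfrac{dS}{dP^*}\,\dfrac{\partial D}{\partial Y_0}}{|J|}
\end{aligned}
\tag{8.42}
$$

where all the derivatives of the demand and supply functions (including those appearing in
the Jacobian) are to be evaluated at the initial equilibrium. You can check that the results
just obtained are identical with those obtained earlier in (8.36) and (8.37), by means of the
single-equation approach.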

Instead of directly applying the implicit-function rule, we can also reach the same result
by first differentiating totally each identity in (8.41) in turn, to get a linear system of
equations in the variables $dP^*$ and $dQ^*$:

$$
\begin{aligned}
\frac{\partial D}{\partial P^*}\,dP^* - dQ^* &= -\frac{\partial D}{\partial Y_0}\,dY_0 \\
\frac{dS}{dP^*}\,dP^* - dQ^* &= 0
\end{aligned}
$$

and then dividing through by $dY_0 \neq 0$, and interpreting each quotient of two differentials
as a derivative.
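
The simultaneous-equation route can be sketched numerically as follows. The three derivative values (`d_p`, `d_y`, `s_p`) are hypothetical numbers assumed to have been measured at the initial equilibrium, and `det2` and `cramer2` are illustrative helper names, not anything defined in the text. The sketch builds the coefficient matrix of the system above, solves it by Cramer's rule, and checks the answer against the single-equation formulas (8.36) and (8.37).

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def cramer2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    d = det2(a)
    if d == 0:
        raise ValueError("zero Jacobian: the implicit-function theorem does not apply")
    x0 = det2([[b[0], a[0][1]], [b[1], a[1][1]]]) / d
    x1 = det2([[a[0][0], b[0]], [a[1][0], b[1]]]) / d
    return x0, x1

# Hypothetical derivative values at the initial equilibrium (assumed, not from the text).
d_p = -2.0   # dD/dP*  < 0
d_y = 0.5    # dD/dY0  > 0
s_p = 3.0    # dS/dP*  > 0

jacobian = [[d_p, -1.0],
            [s_p, -1.0]]
rhs = [-d_y, 0.0]

dp_dy, dq_dy = cramer2(jacobian, rhs)

# Single-equation results for comparison.
dp_dy_single = -d_y / (d_p - s_p)     # (8.36)
dq_dy_single = s_p * dp_dy_single     # (8.37)

print("|J| =", det2(jacobian))
print("dP*/dY0 =", dp_dy, " dQ*/dY0 =", dq_dy)
```

Both approaches use the same information; the matrix form simply delivers the two derivatives in one pass.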


Use of Total Derivatives 

In both the single-equation and the simultaneous-equation approaches illustrated above, we
have taken the total differentials of both sides of an equilibrium identity and then equated
the two results to arrive at the implicit-function rule. Instead of taking the total
differentials, however, it is possible to take, and equate, the total derivatives of the two
sides of the equilibrium identity with respect to a particular exogenous variable or parameter.

In the single-equation approach, for instance, the equilibrium identity is

$$
D(P^*, Y_0) - S(P^*) \equiv 0 \qquad [\text{from (8.35)}]
$$

where $P^* = P^*(Y_0)$ [from (8.34)].

Taking the total derivative of the equilibrium identity with respect to $Y_0$ (which takes into
account the indirect as well as the direct effects of a change in $Y_0$) will therefore give us
the equation

$$
\underbrace{\frac{\partial D}{\partial P^*}\left(\frac{dP^*}{dY_0}\right)}_{\text{indirect effect of } Y_0 \text{ on } D}
+ \underbrace{\frac{\partial D}{\partial Y_0}}_{\text{direct effect of } Y_0 \text{ on } D}
- \underbrace{\frac{dS}{dP^*}\left(\frac{dP^*}{dY_0}\right)}_{\text{indirect effect of } Y_0 \text{ on } S}
= 0
$$
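
Solving the total-derivative equation for $(dP^*/dY_0)$ takes only one step of collecting terms. As a short worked sketch, using nothing beyond the symbols already defined in (8.32) through (8.36):

```latex
\begin{aligned}
\left(\frac{\partial D}{\partial P^*} - \frac{dS}{dP^*}\right)\left(\frac{dP^*}{dY_0}\right)
  &= -\frac{\partial D}{\partial Y_0} \\[1ex]
\left(\frac{dP^*}{dY_0}\right)
  &= -\frac{\partial D/\partial Y_0}{\partial D/\partial P^* - dS/dP^*}
\end{aligned}
```

The division in the second line is legitimate because the bracketed coefficient is $F_P$, which the sign restrictions make strictly negative.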


When this is solved for $(dP^*/dY_0)$, the result is identical with the one in (8.36).

FIGURE 8.7



In the simultaneous-equation approach, on the other hand, there is a pair of equilibrium
identities:

$$
\begin{aligned}
D(P^*, Y_0) - Q^* &\equiv 0 \\
S(P^*) - Q^* &\equiv 0
\end{aligned}
\qquad [\text{from (8.41)}]
$$

where $P^* = P^*(Y_0)$ and $Q^* = Q^*(Y_0)$ [from (8.40)].

The various effects of $Y_0$ are now harder to keep track of, but with the help of the channel
map in Fig. 8.7, the pattern should become clear. This channel map tells us, for instance,
that when differentiating the $D$ function with respect to $Y_0$, we must allow for the indirect
effect of $Y_0$ upon $D$ through $P^*$, as well as the direct effect of $Y_0$ (curved arrow). In
differentiating the $S$ function with respect to $Y_0$, on the other hand, there is only the
indirect effect (through $P^*$) to be taken into account. Thus the result of totally
differentiating the two identities with respect to $Y_0$ is, upon rearrangement, the following
pair of equations:

$$
\begin{aligned}
\frac{\partial D}{\partial P^*}\left(\frac{dP^*}{dY_0}\right) - \left(\frac{dQ^*}{dY_0}\right) &= -\frac{\partial D}{\partial Y_0} \\
\frac{dS}{dP^*}\left(\frac{dP^*}{dY_0}\right) - \left(\frac{dQ^*}{dY_0}\right) &= 0
\end{aligned}
$$

These are, of course, identical with the equations obtained by the total-differential method,
and they lead again to the comparative-static derivatives in (8.42).


National-Income Model (IS-LM) 

A typical application of the implicit-function theorem is a general-functional form of the
IS-LM model.¹ Equilibrium in this macroeconomic model is characterized by an income
level and an interest rate that simultaneously produce equilibrium in both the goods market
and the money market.

A goods market is described by the following set of equations:

$$
\begin{aligned}
Y &= C + I + G & C &= C(Y - T) & G &= G_0 \\
I &= I(r) & T &= T(Y)
\end{aligned}
$$

$Y$ is the level of gross domestic product (GDP), or national income. In this form of the
model, $Y$ can also be thought of as aggregate supply. $C$, $I$, $G$, and $T$ are consumption,
investment, government spending, and taxes, respectively.


¹ IS stands for "investment equals savings" and LM stands for "liquidity preference equals money
supply."

1. Consumption is assumed to be a strictly increasing function of disposable income
$(Y - T)$. If we denote disposable income as $Y^d = Y - T$, then the consumption
function can be expressed as

$$
C = C(Y^d)
$$

where $dC/dY^d = C'(Y^d)$ is the marginal propensity to consume $(0 < C'(Y^d) < 1)$.

2. Investment spending is assumed to be a strictly decreasing function of the rate of
interest, $r$:

$$
\frac{dI}{dr} = I'(r) < 0
$$

3. The public sector is described by two variables: government spending ($G$) and taxes ($T$).
Typically, government spending is assumed to be exogenous (set by policy) whereas taxes
are assumed to be an increasing function of income. $dT/dY = T'(Y)$ is the marginal tax
rate $(0 < T'(Y) < 1)$.

If we substitute the functions for $C$, $I$, and $G$ into the first equation $Y = C + I + G$, we get

$$
Y = C(Y - T(Y)) + I(r) + G_0 \qquad \text{(IS curve)}
$$

which gives us a single equation with two endogenous variables: $Y$ and $r$. This equation
gives us all the combinations of $Y$ and $r$ that produce equilibrium in the goods market.
This equation implicitly defines the IS curve.

Slope of the IS Curve

If we rewrite the IS equation, which is in the nature of an equilibrium identity,

$$
Y - C(Y^d) - I(r) - G_0 \equiv 0
$$

then the total differential with respect to $Y$ and $r$ is

$$
dY - C'(Y^d)[1 - T'(Y)]\,dY - I'(r)\,dr = 0
$$

Note: $\dfrac{dY^d}{dY} = 1 - T'(Y)$

We can rearrange the $dY$ and $dr$ terms to get an expression for the slope of the IS curve:

$$
\frac{dr}{dY} = \frac{1 - C'(Y^d)[1 - T'(Y)]}{I'(r)} < 0
$$

Given the restrictions placed on the derivatives of $C$, $I$, and $T$, we can easily verify that the
slope of the IS curve is negative.

The money market can be described by the following three equations:

$$
M^d = L(Y, r) \qquad [\text{money demand}]
$$

where $L_Y > 0$ and $L_r < 0$,

$$
M^s = M^s_0 \qquad [\text{money supply}]
$$

where the money supply is assumed to be exogenously determined by the central monetary
authority, and

$$
M^d = M^s \qquad [\text{equilibrium condition}]
$$

Substituting the first two equations into the third, we get an expression that implicitly
defines the LM curve, which is again in the nature of an equilibrium identity:

$$
L(Y, r) \equiv M^s_0
$$


Slope of the LM Curve

Since this is an equilibrium identity, we can take the total differential with respect to the
two endogenous variables, $Y$ and $r$:

$$
L_Y\,dY + L_r\,dr = 0
$$

which can be rearranged to give us an expression for the slope of the LM curve:

$$
\frac{dr}{dY} = -\frac{L_Y}{L_r} > 0
$$

Since $L_Y > 0$ and $L_r < 0$, we can determine that the slope of the LM curve is positive.
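
As a quick numerical illustration of the two slope formulas, the sketch below plugs in hypothetical derivative values that respect the sign restrictions stated in the text. The numbers and the function names `is_slope` and `lm_slope` are assumptions made for illustration only.

```python
# Hypothetical derivative values (assumed, not from the text).
c_prime = 0.8     # marginal propensity to consume, 0 < C' < 1
t_prime = 0.25    # marginal tax rate, 0 < T' < 1
i_prime = -50.0   # I'(r) < 0
l_y = 0.5         # L_Y > 0
l_r = -100.0      # L_r < 0

def is_slope(c_prime, t_prime, i_prime):
    """dr/dY along the IS curve: {1 - C'(1 - T')} / I'."""
    return (1.0 - c_prime * (1.0 - t_prime)) / i_prime

def lm_slope(l_y, l_r):
    """dr/dY along the LM curve: -L_Y / L_r."""
    return -l_y / l_r

slope_is = is_slope(c_prime, t_prime, i_prime)
slope_lm = lm_slope(l_y, l_r)

print("IS slope:", slope_is)   # negative under the sign restrictions
print("LM slope:", slope_lm)   # positive under the sign restrictions
```

Any other values obeying the same restrictions give the same signs, which is the qualitative content of the two results.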

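The simultaneous macroeconomic equilibrium state of both the goods and money markets
can be described by the following system of equations:

$$
\begin{aligned}
Y &\equiv C(Y^d) + I(r) + G_0 \\
L(Y, r) &\equiv M^s_0
\end{aligned}
$$

which implicitly define the two endogenous variables, $Y$ and $r$, as functions of the
exogenous variables, $G_0$ and $M^s_0$. Taking the total differential of the system, we get

$$
\begin{aligned}
dY - C'(Y^d)[1 - T'(Y)]\,dY - I'(r)\,dr &= dG_0 \\
L_Y\,dY + L_r\,dr &= dM^s_0
\end{aligned}
$$

or, in matrix form,

$$
\begin{bmatrix}
1 - C'(Y^d)[1 - T'(Y)] & -I'(r) \\
L_Y & L_r
\end{bmatrix}
\begin{bmatrix} dY \\ dr \end{bmatrix}
=
\begin{bmatrix} dG_0 \\ dM^s_0 \end{bmatrix}
$$

The Jacobian determinant is

$$
|J| =
\begin{vmatrix}
1 - C'(Y^d)[1 - T'(Y)] & -I'(r) \\
L_Y & L_r
\end{vmatrix}
= \{1 - C'(Y^d)[1 - T'(Y)]\}\,L_r + L_Y\,I'(r) < 0
$$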


Since $|J| \neq 0$, this system satisfies the conditions of the implicit-function theorem and
the implicit functions

$$
Y^* = Y^*(G_0, M^s_0)
$$

and

$$
r^* = r^*(G_0, M^s_0)
$$

can be written even though we are unable to solve for $Y^*$ and $r^*$ explicitly. We can
nevertheless perform comparative-static exercises to determine the effects of a change in one
of the exogenous variables $(G_0, M^s_0)$ on the equilibrium values of $Y^*$ and $r^*$. Consider
the comparative-static derivatives $\partial Y^*/\partial G_0$ and $\partial r^*/\partial G_0$,
which we shall derive by applying the implicit-function theorem to our system of
total differentials in matrix form:

$$
\begin{bmatrix}
1 - C'(Y^d)[1 - T'(Y)] & -I'(r) \\
L_Y & L_r
\end{bmatrix}
\begin{bmatrix} dY^* \\ dr^* \end{bmatrix}
=
\begin{bmatrix} dG_0 \\ dM^s_0 \end{bmatrix}
$$

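First we set $dM^s_0 = 0$ and divide both sides by $dG_0$:

$$
\begin{bmatrix}
1 - C'\cdot(1 - T') & -I'(r) \\
L_Y & L_r
\end{bmatrix}
\begin{bmatrix}
\dfrac{dY^*}{dG_0} \\[2ex]
\dfrac{dr^*}{dG_0}
\end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
$$

Using Cramer's rule, we obtain

$$
\frac{dY^*}{dG_0} = \frac{\begin{vmatrix} 1 & -I' \\ 0 & L_r \end{vmatrix}}{|J|} = \frac{L_r}{|J|} > 0
$$

and

$$
\frac{dr^*}{dG_0} = \frac{\begin{vmatrix} 1 - C'\cdot(1 - T') & 1 \\ L_Y & 0 \end{vmatrix}}{|J|} = \frac{-L_Y}{|J|} > 0
$$

From the implicit-function theorem, these ratios of differentials, $dY^*/dG_0$ and
$dr^*/dG_0$, can be interpreted as partial derivatives

$$
\frac{\partial Y^*(G_0, M^s_0)}{\partial G_0}
\qquad \text{and} \qquad
\frac{\partial r^*(G_0, M^s_0)}{\partial G_0}
$$

which are our desired comparative-static derivatives.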
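
The two fiscal-policy derivatives can be checked numerically. In the sketch below, every derivative value is a hypothetical assumption consistent with the stated sign restrictions, and `det2` and `solve_by_cramer` are illustrative helpers rather than anything from the text. The system is solved for a unit change in $G_0$ with $dM^s_0 = 0$, and the output is compared with the closed forms $L_r/|J|$ and $-L_Y/|J|$.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def solve_by_cramer(a, b):
    """Return (x0, x1, |a|) for the 2x2 system a @ x = b."""
    d = det2(a)
    x0 = det2([[b[0], a[0][1]], [b[1], a[1][1]]]) / d
    x1 = det2([[a[0][0], b[0]], [a[1][0], b[1]]]) / d
    return x0, x1, d

# Hypothetical derivative values at the initial equilibrium (assumed).
c_prime, t_prime, i_prime = 0.8, 0.25, -50.0
l_y, l_r = 0.5, -100.0

coefficient = [[1.0 - c_prime * (1.0 - t_prime), -i_prime],
               [l_y, l_r]]

# dM0 = 0 and division by dG0 leave the column (1, 0) on the right-hand side.
dy_dg, dr_dg, jac = solve_by_cramer(coefficient, [1.0, 0.0])

print("|J| =", jac)
print("dY*/dG0 =", dy_dg)
print("dr*/dG0 =", dr_dg)
```

With these particular numbers an extra unit of government spending raises income by more than one unit and pushes the interest rate up slightly; only the signs, not the magnitudes, are general.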


Extending the Model: An Open Economy 

One property of a model that economists look for is its robustness: the ability of the model
to be applied to different settings. At this point we will extend the basic model to
incorporate the foreign sector.

1. Net exports. Let $X$ denote exports, $M$ denote imports, and $E$ denote the exchange rate
(measured as the domestic price of foreign currency). Exports are an increasing function
of the exchange rate:

$$
X = X(E) \qquad \text{where } X'(E) > 0
$$

Imports are a decreasing function of the exchange rate but an increasing function of
income:

$$
M = M(Y, E) \qquad \text{where } M_Y > 0,\ M_E < 0
$$

2. Capital flows. The net flow of capital into a country is a function of both the domestic
interest rate $r$ and the world interest rate $r_w$. Let $K$ denote net capital inflow such that

$$
K = K(r, r_w) \qquad \text{where } K_r > 0,\ K_{r_w} < 0
$$

3. Balance of payments. The inflows and outflows of foreign currency for a country are
typically separated into two accounts: the current account (net exports of goods and
services) and the capital account (the purchasing of foreign and domestic bonds). Together,
the two accounts make up the balance of payments.

$$
\begin{aligned}
BP &= \text{current account} + \text{capital account} \\
   &= [X(E) - M(Y, E)] + K(r, r_w)
\end{aligned}
$$

Under flexible exchange rates, the exchange rate adjusts to keep the balance of payments
equal to zero. Having the balance of payments equal to zero is equivalent to saying that
the supply of foreign currency equals the demand for foreign currency by a country.¹


Open-Economy Equilibrium 

Equilibrium in an open economy is characterized by three conditions: aggregate demand
equals aggregate supply; the demand for money equals the supply of money; the balance of
payments equals zero. Adding the foreign sector to our basic model gives us the following
system of three equations:

$$
\begin{aligned}
Y &= C(Y^d) + I(r) + G_0 + X(E) - M(Y, E) \\
L(Y, r) &= M^s_0 \\
X(E) - M(Y, E) + K(r, r_w) &= 0
\end{aligned}
$$

Since we have three equations, we need three endogenous variables, which are $Y$, $r$, and
$E$. The exogenous variables now become $G_0$, $M^s_0$, and $r_w$. Rewriting the system as
equilibrium identities $F^1 \equiv 0$, $F^2 \equiv 0$, $F^3 \equiv 0$ allows us to find the Jacobian:

$$
\begin{aligned}
Y - C(Y^d) - I(r) - G_0 - X(E) + M(Y, E) &\equiv 0 \\
L(Y, r) - M^s_0 &\equiv 0 \\
X(E) - M(Y, E) + K(r, r_w) &\equiv 0
\end{aligned}
$$

$$
|J| =
\begin{vmatrix}
1 - C'\cdot(1 - T') + M_Y & -I' & M_E - X' \\
L_Y & L_r & 0 \\
-M_Y & K_r & X' - M_E
\end{vmatrix}
$$

Using Laplace expansion down the third column, we obtain

$$
\begin{aligned}
|J| &= (M_E - X')
\begin{vmatrix} L_Y & L_r \\ -M_Y & K_r \end{vmatrix}
+ (X' - M_E)
\begin{vmatrix} 1 - C'\cdot(1 - T') + M_Y & -I' \\ L_Y & L_r \end{vmatrix} \\[1ex]
&= (M_E - X')(L_Y K_r + L_r M_Y) + (X' - M_E)\{[1 - C'\cdot(1 - T') + M_Y]L_r + I' L_Y\} \\[1ex]
&= (M_E - X')\{L_Y(K_r - I') + L_r[C'\cdot(1 - T') - 1]\}
\end{aligned}
$$

Given the assumptions about the signs of the partial derivatives and the restriction that
$0 < C'\cdot(1 - T') < 1$, we can determine that $|J| < 0$. Therefore, we can write the implicit
functions

$$
\begin{aligned}
Y^* &= Y^*(G_0, M^s_0, r_w) \\
r^* &= r^*(G_0, M^s_0, r_w) \\
E^* &= E^*(G_0, M^s_0, r_w)
\end{aligned}
$$
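
The Laplace expansion and the sign of $|J|$ can be verified with a few numbers. The sketch below is illustrative only: all derivative values are hypothetical assumptions that obey the stated sign restrictions, and `det3` is a helper name introduced here. It compares a direct evaluation of the $3 \times 3$ determinant with the factored expression in the last line of the expansion.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical derivative values (assumed, not from the text).
c_prime, t_prime = 0.8, 0.25   # 0 < C'(1 - T') < 1
i_prime = -50.0                # I' < 0
l_y, l_r = 0.5, -100.0         # L_Y > 0, L_r < 0
m_y, m_e = 0.1, -10.0          # M_Y > 0, M_E < 0
x_prime = 20.0                 # X' > 0
k_r = 30.0                     # K_r > 0

a11 = 1.0 - c_prime * (1.0 - t_prime) + m_y
jacobian = [[a11, -i_prime, m_e - x_prime],
            [l_y, l_r, 0.0],
            [-m_y, k_r, x_prime - m_e]]

direct = det3(jacobian)
factored = (m_e - x_prime) * (l_y * (k_r - i_prime)
                              + l_r * (c_prime * (1.0 - t_prime) - 1.0))

print("|J| computed directly:", direct)
print("|J| from factored form:", factored)
```

The two values coincide and are negative, as the sign argument in the text requires.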


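¹ Under a fixed exchange rate regime, the balance of payments is not necessarily zero. In such an
event, any surpluses or deficits are recorded as changes in official settlements.

Taking the total differential of the system of equations and writing it in matrix form

$$
\begin{bmatrix}
1 - C'\cdot(1 - T') + M_Y & -I' & M_E - X' \\
L_Y & L_r & 0 \\
-M_Y & K_r & X' - M_E
\end{bmatrix}
\begin{bmatrix} dY^* \\ dr^* \\ dE^* \end{bmatrix}
=
\begin{bmatrix} dG_0 \\ dM^s_0 \\ -K_{r_w}\,dr_w \end{bmatrix}
$$

will allow us to carry out a series of comparative-static exercises. Let's consider the impact
of a change in the world interest rate $r_w$ on the equilibrium values of $Y$, $r$, and $E$.
Setting $dG_0 = dM^s_0 = 0$ and dividing both sides by $dr_w$ gives us

$$
\begin{bmatrix}
1 - C'\cdot(1 - T') + M_Y & -I' & M_E - X' \\
L_Y & L_r & 0 \\
-M_Y & K_r & X' - M_E
\end{bmatrix}
\begin{bmatrix}
\dfrac{\partial Y^*}{\partial r_w} \\[2ex]
\dfrac{\partial r^*}{\partial r_w} \\[2ex]
\dfrac{\partial E^*}{\partial r_w}
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ -K_{r_w} \end{bmatrix}
$$

Using Cramer's rule, we obtain the comparative-static derivatives

$$
\frac{\partial Y^*}{\partial r_w} =
\frac{\begin{vmatrix}
0 & -I' & M_E - X' \\
0 & L_r & 0 \\
-K_{r_w} & K_r & X' - M_E
\end{vmatrix}}{|J|}
= \frac{(-K_{r_w})(-L_r)(M_E - X')}{|J|} > 0
$$

and

$$
\frac{\partial r^*}{\partial r_w} =
\frac{\begin{vmatrix}
1 - C'\cdot(1 - T') + M_Y & 0 & M_E - X' \\
L_Y & 0 & 0 \\
-M_Y & -K_{r_w} & X' - M_E
\end{vmatrix}}{|J|}
= \frac{K_{r_w}(-L_Y)(M_E - X')}{|J|} > 0
$$

and

$$
\frac{\partial E^*}{\partial r_w} =
\frac{\begin{vmatrix}
1 - C'\cdot(1 - T') + M_Y & -I' & 0 \\
L_Y & L_r & 0 \\
-M_Y & K_r & -K_{r_w}
\end{vmatrix}}{|J|}
= \frac{-K_{r_w}\{[1 - C'\cdot(1 - T') + M_Y]L_r + L_Y I'\}}{|J|} > 0
$$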

At this point you should compare the results we have derived to the macroeconomic
principles. Intuitively, a rise in the world interest rate should lead to an increase in capital
outflows and a depreciation of the domestic currency. This, in turn, will lead to an increase
in net exports and income. The increase in domestic income will cause an increase in
money demand, putting upward pressure on domestic interest rates. This result is illustrated
graphically in Fig. 8.8 where a rise in world interest rates leads to a rightward shift of the
IS curve.
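
The three responses to a change in $r_w$ can be reproduced with a small Cramer's-rule sketch. As before, the derivative values are hypothetical assumptions respecting the sign restrictions, and `det3` and `cramer3` are illustrative helper names. The right-hand side is the column $(0, 0, -K_{r_w})$ obtained after setting $dG_0 = dM^s_0 = 0$ and dividing by $dr_w$.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cramer3(a, b):
    """Solve the 3x3 system a @ x = b by replacing one column at a time."""
    d = det3(a)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for i in range(3):
            replaced[i][col] = b[i]
        solution.append(det3(replaced) / d)
    return solution

# Hypothetical derivative values (assumed, not from the text).
c_prime, t_prime, i_prime = 0.8, 0.25, -50.0
l_y, l_r = 0.5, -100.0
m_y, m_e, x_prime = 0.1, -10.0, 20.0
k_r, k_rw = 30.0, -30.0        # K_r > 0, K_rw < 0

a11 = 1.0 - c_prime * (1.0 - t_prime) + m_y
jacobian = [[a11, -i_prime, m_e - x_prime],
            [l_y, l_r, 0.0],
            [-m_y, k_r, x_prime - m_e]]
rhs = [0.0, 0.0, -k_rw]

dy_drw, dr_drw, de_drw = cramer3(jacobian, rhs)
jac = det3(jacobian)

print("dY*/dr_w =", dy_drw)
print("dr*/dr_w =", dr_drw)
print("dE*/dr_w =", de_drw)
```

All three derivatives come out positive, matching the narrative: the currency depreciates ($E$ rises), income rises, and the domestic interest rate is pulled up.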





FIGURE 8.8 



Summary of the Procedure 

In the analysis of the general-function market model and national-income model, it is not
possible to obtain explicit solution values of the endogenous variables. Instead, we rely on
the implicit-function theorem to enable us to write the implicit solutions such as

$$
P^* = P^*(Y_0) \qquad \text{and} \qquad r^* = r^*(G_0, M^s_0)
$$

Our subsequent search for the comparative-static derivatives such as $(dP^*/dY_0)$ and
$(\partial r^*/\partial G_0)$ then rests for its meaningfulness upon the known fact (thanks
again to the implicit-function theorem) that the $P^*$ and $r^*$ functions do possess
continuous derivatives.

To facilitate the application of that theorem, we make it a standard practice to write the
equilibrium condition(s) of the model in the form of (8.19) or (8.24). We then check
whether (1) the $F$ function(s) have continuous derivatives and (2) the value of $F_y$ or the
endogenous-variable Jacobian determinant (as the case may be) is nonzero at the initial
equilibrium of the model. However, as long as the individual functions in the model have
continuous derivatives (an assumption which is often adopted as a matter of course in
general-function models), the first condition is automatically satisfied. As a practical
matter, therefore, it is needed only to check the value of $F_y$ or the endogenous-variable
Jacobian. And if it is nonzero at the equilibrium, we may proceed at once to the task of
finding the comparative-static derivatives.

To that end, the implicit-function rule is of help. For the single-equation case, simply set
the endogenous variable equal to its equilibrium value (e.g., set $P = P^*$) in the
equilibrium condition, and then apply the rule as stated in (8.23) to the resulting equilibrium
identity. For the simultaneous-equation case, we must also first set all endogenous variables
equal to their respective equilibrium values in the equilibrium conditions. Then we can
either apply the implicit-function rule as illustrated in (8.29) to the resulting equilibrium
identities, or arrive at the same result by carrying out the several steps outlined as follows:

1. Take the total differential of each equilibrium identity in turn.

2. Select one, and only one, exogenous variable (say, $X_0$) as the sole disequilibrating
factor, and set the differentials of all other exogenous variables equal to zero. Then divide
all remaining terms in each identity by $dX_0$, and interpret each quotient of two
differentials as a comparative-static derivative (a partial one if the model contains two or
more exogenous variables).†

3. Solve the resulting equation system for the comparative-static derivatives appearing
therein, and interpret their economic implications. In this step, if Cramer's rule is used,
we can take advantage of the fact that, earlier, in checking the condition $|J| \neq 0$, we
have in fact already calculated the determinant of the coefficient matrix of the equation
system now being solved.

4. For the analysis of another disequilibrating factor (another exogenous variable), if any,
repeat steps 2 and 3. Although a different group of comparative-static derivatives will
emerge in the new equation system, the coefficient matrix will be the same as before,
and thus the known value of $|J|$ can again be put to use.

Given a model with $m$ exogenous variables, it will take exactly $m$ applications of steps 1, 2,
and 3 to catch all the comparative-static derivatives there are.
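
The steps above can be sketched as a small routine. This is only an illustration under stated assumptions: the function names `det` and `comparative_statics`, the dictionary layout of `shocks`, and the IS-LM derivative values are all hypothetical. The routine takes the coefficient matrix (whose determinant is $|J|$) and, for each exogenous variable, the right-hand-side column left after step 2; it then applies Cramer's rule, reusing the same $|J|$ each time as noted in step 4.

```python
def det(m):
    """Determinant of a square matrix by Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def comparative_statics(jacobian, shocks):
    """Map each exogenous variable to its list of comparative-static derivatives.

    jacobian: coefficient matrix of the totally differentiated system.
    shocks:   {name: right-hand-side column after dividing by that variable's differential}.
    """
    jac = det(jacobian)              # computed once, reused for every shock (step 4)
    if jac == 0:
        raise ValueError("zero Jacobian: the implicit-function theorem does not apply")
    results = {}
    for name, rhs in shocks.items():
        derivatives = []
        for col in range(len(jacobian)):
            replaced = [row[:] for row in jacobian]
            for i, value in enumerate(rhs):
                replaced[i][col] = value
            derivatives.append(det(replaced) / jac)
        results[name] = derivatives
    return results

# Closed-economy IS-LM example with hypothetical derivative values.
c_prime, t_prime, i_prime, l_y, l_r = 0.8, 0.25, -50.0, 0.5, -100.0
islm = [[1.0 - c_prime * (1.0 - t_prime), -i_prime],
        [l_y, l_r]]
effects = comparative_statics(islm, {"G0": [1.0, 0.0], "M0": [0.0, 1.0]})

for name, (dy, dr) in effects.items():
    print(f"d/d{name}: dY* = {dy:.6f}, dr* = {dr:.6f}")
```

Laplace expansion is used here only because it mirrors the hand calculations in the text; for larger systems an elimination-based solver would be the more economical choice.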

† Instead of taking steps 1 and 2, we may equivalently resort to the total-derivative method by
differentiating (both sides of) each equilibrium identity totally with respect to the selected exogenous
variable. In so doing, a channel map will prove to be of help.


EXERCISE 8.6 

1. Let the equilibrium condition for national income be

$$
S(Y) + T(Y) = I(Y) + G_0 \qquad (S', T', I' > 0;\ S' + T' > I')
$$

where $S$, $Y$, $T$, $I$, and $G$ stand for saving, national income, taxes, investment, and
government expenditure, respectively. All derivatives are continuous.

(a) Interpret the economic meanings of the derivatives $S'$, $T'$, and $I'$.

(b) Check whether the conditions of the implicit-function theorem are satisfied. If so,
write the equilibrium identity.

(c) Find $(dY^*/dG_0)$ and discuss its economic implications.

2. Let the demand and supply functions for a commodity be

$$
\begin{aligned}
Q_d &= D(P, Y_0) \qquad (D_P < 0;\ D_{Y_0} > 0) \\
Q_s &= S(P, T_0) \qquad (S_P > 0;\ S_{T_0} < 0)
\end{aligned}
$$

where $Y_0$ is income and $T_0$ is the tax on the commodity. All derivatives are continuous.

(a) Write the equilibrium condition in a single equation.

(b) Check whether the implicit-function theorem is applicable. If so, write the
equilibrium identity.

(c) Find $(\partial P^*/\partial Y_0)$ and $(\partial P^*/\partial T_0)$, and discuss their
economic implications.

(d) Using a procedure similar to (8.37), find $(\partial Q^*/\partial Y_0)$ from the supply
function and $(\partial Q^*/\partial T_0)$ from the demand function. (Why not use the demand
function for the former, and the supply function for the latter?)

3. Solve Prob. 2 by the simultaneous-equation approach. 

4. Let the demand and supply functions for a commodity be

$$
Q_d = D(\underset{-}{P},\ \underset{+}{t_0}) \qquad \text{and} \qquad Q_s = Q_{s0}
$$

where $t_0$ is consumers' taste for the commodity, and where both partial derivatives are
continuous.

(a) What is the meaning of the $-$ and $+$ signs beneath the independent variables $P$ and
$t_0$?

(b) Write the equilibrium condition as a single equation.

(c) Is the implicit-function theorem applicable?

(d) How would the equilibrium price vary with consumers' taste?

5. Consider the following national-income model (with taxes ignored):

$$
\begin{aligned}
Y - C(Y) - I(i) - G_0 &= 0 \qquad (0 < C' < 1;\ I' < 0) \\
kY + L(i) - M_{s0} &= 0 \qquad (k = \text{positive constant};\ L' < 0)
\end{aligned}
$$

(a) Is the first equation in the nature of an equilibrium condition?

(b) What is the total quantity demanded for money in this model?

(c) Analyze the comparative statics of the model when money supply changes
(monetary policy) and when government expenditure changes (fiscal policy).

6. In Prob. 5, suppose that while the demand for money still depends on $Y$ as specified, it
is now no longer affected by the interest rate.

(a) How should the model statement be revised?

(b) Write the new Jacobian, call it $|J|'$. Is $|J|'$ numerically (in absolute value) larger or
smaller than $|J|$?

(c) Would the implicit-function rule still apply?

(d) Find the new comparative-static derivatives.

(e) Comparing the new $(\partial Y^*/\partial G_0)$ with that in Prob. 5, what can you conclude
about the effectiveness of fiscal policy in the new model where $Y$ is independent of $i$?

(f) Comparing the new $(\partial Y^*/\partial M_{s0})$ with that in Prob. 5, what can you say
about the effectiveness of monetary policy in the new model?


8.7 Limitations of Comparative Statics

Comparative statics is a useful area of study, because in economics we are often interested
in finding out how a disequilibrating change in a parameter will affect the equilibrium state
of a model. It is important to realize, however, that by its very nature comparative statics
ignores the process of adjustment from the old equilibrium to the new and also neglects the
length of time required in that adjustment process. As a consequence, it must of necessity
also disregard the possibility that, because of the inherent instability of the model, the new
equilibrium may not be attainable ever. The study of the process of adjustment per se
belongs to the field of economic dynamics. When we come to that, particular attention will be
directed toward the manner in which a variable will change over time, and explicit
consideration will be given to the question of stability of equilibrium.

The important topic of dynamics, however, must wait its turn. Meanwhile, in Part 4, we
shall undertake to study the problem of optimization, an exceedingly important special
variety of equilibrium analysis with attendant comparative-static implications (and
complications) of its own.

When we first introduced the term equilibrium in Chap. 3, we made a broad distinction
between goal and nongoal equilibrium. In the latter type, exemplified by our study of
market and national-income models, the interplay of certain opposing forces in the model
(e.g., the forces of demand and supply in the market models and the forces of leakages and
injections in the income models) dictates an equilibrium state, if any, in which these
opposing forces are just balanced against each other, thus obviating any further tendency
to change. The attainment of this type of equilibrium is the outcome of the impersonal
balancing of these forces and does not require the conscious effort on the part of anyone to
accomplish a specified goal. True, the consuming households behind the forces of demand
and the firms behind the forces of supply are each striving for an optimal position under
the given circumstances, but as far as the market itself is concerned, no one is aiming at
any particular equilibrium price or equilibrium quantity (unless, of course, the government
happens to be trying to peg the price). Similarly, in national-income determination,
the impersonal balancing of leakages and injections is what brings about an equilibrium
state, and no conscious effort at reaching any particular goal (such as an attempt to alter
an undesirable income level by means of monetary or fiscal policies) needs to be involved
at all.

In the present part of the book, however, our attention will be turned to the study of goal
equilibrium, in which the equilibrium state is defined as the optimum position for a given
economic unit (a household, a business firm, or even an entire economy) and in which the
said economic unit will be deliberately striving for attainment of that equilibrium. As a
result, in this context (but only in this context) our earlier warning that equilibrium does
not imply desirability becomes irrelevant and immaterial. In this part of the book, our
primary focus will be on the classical techniques for locating optimum positions: those using
differential calculus. More modern developments, known as mathematical programming,
will be discussed in Chap. 13.

9.1 Optimum Values and Extreme Values

Economics is essentially a science of choice. When an economic project is to be carried 
out, such as the production of a specified level of output, there are normally a number of 
alternative ways of accomplishing it. One (or more) of these alternatives will, however, be
more desirable than others from the standpoint of some criterion, and it is the essence of
the optimization problem to choose, on the basis of that specified criterion, the best
alternative available.

The most common criterion of choice among alternatives in economics is the goal of
maximizing something (such as maximizing a firm's profit, a consumer's utility, or the rate
of growth of a firm or of a country's economy) or of minimizing something (such as
minimizing the cost of producing a given output). Economically, we may categorize such
maximization and minimization problems under the general heading of optimization, meaning
"the quest for the best." From a purely mathematical point of view, however, the terms
maximum and minimum do not carry with them any connotation of optimality. Therefore, the
collective term for maximum and minimum, as mathematical concepts, is the more
matter-of-fact designation extremum, meaning an extreme value.

In formulating an optimization problem, the first order of business is to delineate an 
objective function in which the dependent variable represents the object of maximization 
or minimization and in which the set of independent variables indicates the objects whose 
magnitudes the economic unit in question can pick and choose, with a view to optimizing. 
We shall therefore refer to the independent variables as choice variables.† The essence of
the optimization process is simply to find the set of values of the choice variables that will 
lead us to the desired extremum of the objective function. 

For example, a business firm may seek to maximize profit $\pi$, that is, to maximize the
difference between total revenue $R$ and total cost $C$. Since, within the framework of a given
state of technology and a given market demand for the firm's product, $R$ and $C$ are both
functions of the output level $Q$, it follows that $\pi$ is also expressible as a function of $Q$:

$$\pi(Q) = R(Q) - C(Q)$$

This equation constitutes the relevant objective function, with $\pi$ as the object of
maximization and $Q$ as the (only) choice variable. The optimization problem is then that of
choosing the level of $Q$ that maximizes $\pi$. Note that while the optimal level of $\pi$ is by
definition its maximal level, the optimal level of the choice variable $Q$ is itself not required
to be either a maximum or a minimum.
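
To make the roles of objective function and choice variable concrete, here is a minimal
Python sketch. The revenue and cost functions in it are hypothetical, chosen only for
illustration, and the crude grid search is merely a stand-in for the calculus methods
developed in the following sections.

```python
# Hypothetical revenue and cost functions (assumptions, not from the text).
def revenue(q):
    return 100 * q - q ** 2


def cost(q):
    return 20 * q + 50


def profit(q):
    # Objective function: pi(Q) = R(Q) - C(Q)
    return revenue(q) - cost(q)


# Candidate output levels 0.0, 0.1, ..., 100.0 for the choice variable Q.
candidates = [q / 10 for q in range(0, 1001)]

# Pick the output level that gives the largest profit on the grid.
best_q = max(candidates, key=profit)
best_profit = profit(best_q)

print(best_q, best_profit)
```

Under these assumed functions the profit-maximizing output lies in the interior of the grid:
the optimal $Q$ is neither the smallest nor the largest admissible output, even though the
optimal $\pi$ is by definition the largest attainable profit.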

To cast the problem into a more general mold for further discussion (though still confining
ourselves to objective functions of one variable only), let us consider the general
function

$$y = f(x)$$

and attempt to develop a procedure for finding the level of $x$ that will maximize or minimize
the value of $y$. It will be assumed in our discussion that the function $f$ is continuously
differentiable.

† They can also be called decision variables, or policy variables.

9.2 Relative Maximum and Minimum: First-Derivative Test

Since the objective function $y = f(x)$ is stated in the general form, there is no restriction as
to whether it is linear or nonlinear or whether it is monotonic or contains both increasing and
decreasing parts. From among the many possible types of function compatible with the
objective-function form discussed in Sec. 9.1, we have selected three specific cases to be
depicted in Fig. 9.1. Simple as they may be, the graphs in Fig. 9.1 should give us valuable
insight into the problem of locating the maximum or minimum value of the function $y = f(x)$.

Relative versus Absolute Extremum 

If the objective function is a constant function, as in Fig. 9.1a, all values of the choice 
variable x will result in the same value of y, and the height of each point on the graph of 
the function (such as A or B or C) may be considered a maximum or, for that matter, a 
minimum—or, indeed, neither. In this case, there is in effect no significant choice to be 
made regarding the value of $x$ for the maximization or minimization of $y$.

In Fig. 9.1b, the function is strictly increasing, and there is no finite maximum if the set
of nonnegative real numbers is taken to be its domain. However, we may consider the end
point D on the left (the $y$ intercept) as representing a minimum; in fact, it is in this case the
absolute (or global) minimum in the range of the function.

The points E and F in Fig. 9.1c, on the other hand, are examples of a relative (or local) 
extremum, in the sense that each of these points represents an extremum in the immediate 
neighborhood of the point only. The fact that point F is a relative minimum is, of course, no
guarantee that it is also the global minimum of the function, although this may happen to
be the case. Similarly, a relative maximum point such as E may or may not be a global
maximum. Note also that a function can very well have several relative extrema, some of which
may be maxima while others are minima. 

FIGURE 9.2

In most economic problems that we shall be dealing with, our primary, if not exclusive,
concern will be with extreme values other than end-point values, for with most such problems
the domain of the objective function is restricted to be the set of nonnegative real
numbers, and thus an end point (on the left) will represent the zero level of the choice
variable, which is often of no practical interest. Actually, the type of function most
frequently encountered in economic analysis is that shown in Fig. 9.1c, or some variant
thereof that contains only a single bend in the curve. We shall therefore continue our
discussion mainly with reference to the search for relative extrema such as points E and F.
This will, however, by no means foreclose the knowledge of an absolute maximum if we want
it, because an absolute maximum must be either a relative maximum or one of the end points
of the function. Thus if we know all the relative maxima, it is necessary only to select the
largest of these and compare it with the end points in order to determine the absolute maximum.
The absolute minimum of a function can be found analogously. Hereafter, the extreme values
considered will be relative or local ones, unless indicated otherwise.
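
The comparison of relative maxima with end points can be sketched in a few lines of Python.
The cubic function, its closed domain $[0, 5]$, and the location of its single relative
maximum at $x = 1$ are all assumptions made for this illustration; they do not come from the
figures in the text.

```python
# Hypothetical function on the closed interval [0, 5] (an assumption).
def f(x):
    return x ** 3 - 6 * x ** 2 + 9 * x + 1


# f has one relative maximum in the interior, at x = 1
# (its derivative 3(x - 1)(x - 3) changes sign from + to - there).
relative_maxima = [1.0]
end_points = [0.0, 5.0]

# The absolute maximum must be one of these candidates.
candidates = relative_maxima + end_points
x_best = max(candidates, key=f)

print(x_best, f(x_best))
```

In this assumed case the right-hand end point beats the relative maximum, which shows why
the end points must be included in the comparison.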

First-Derivative Test 

As a matter of terminology, from now on we shall refer to the derivative of a function 
alternatively as its first derivative (short for first-order derivative). The reason for this will 
become apparent shortly. 

Given a function $y = f(x)$, the first derivative $f'(x)$ plays a major role in our search for
its extreme values. This is due to the fact that, if a relative extremum of the function occurs
at $x = x_0$, then either (1) $f'(x_0)$ does not exist, or (2) $f'(x_0) = 0$. The first eventuality is
illustrated in Fig. 9.2a, where both points A and B depict relative extreme values of $y$, and
yet no derivative is defined at either of these sharp points. Since in the present discussion
we are assuming that $y = f(x)$ is continuous and possesses a continuous derivative, however,
we are in effect ruling out sharp points. For smooth functions, relative extreme values
can occur only where the first derivative has a zero value. This is illustrated by points C and
D in Fig. 9.2b, both of which represent extreme values, and both of which are characterized
by a zero slope: $f'(x_1) = 0$ and $f'(x_2) = 0$. It is also easy to see that when the slope is
nonzero we cannot possibly have a relative minimum (the bottom of a valley) or a relative
maximum (the peak of a hill). For this reason, we can, in the context of smooth functions,
take the condition $f'(x) = 0$ to be a necessary condition for a relative extremum (either
maximum or minimum).

We must hasten to add, however, that a zero slope, while necessary, is not sufficient to 
establish a relative extremum. An example of the case where a zero slope is not associated 
with an extremum will be presented shortly. By appending a certain proviso to the zero- 
slope condition, however, we can obtain a decisive test for a relative extremum. This may 
be stated as follows: 

First-derivative test for relative extremum  If the first derivative of a function $f(x)$ at
$x = x_0$ is $f'(x_0) = 0$, then the value of the function at $x_0$, $f(x_0)$, will be

a. A relative maximum if the derivative $f'(x)$ changes its sign from positive to negative
   from the immediate left of the point $x_0$ to its immediate right.

b. A relative minimum if $f'(x)$ changes its sign from negative to positive from the
   immediate left of $x_0$ to its immediate right.

c. Neither a relative maximum nor a relative minimum if $f'(x)$ has the same sign on both
   the immediate left and the immediate right of point $x_0$.

Let us call the value $x_0$ a critical value of $x$ if $f'(x_0) = 0$, and refer to $f(x_0)$ as a
stationary value of $y$ (or of the function $f$). The point with coordinates $x_0$ and $f(x_0)$ can,
accordingly, be called a stationary point. (The rationale for the word stationary should be
self-evident: wherever the slope is zero, the point in question is never situated on an
upward or downward incline, but is rather at a standstill position.) Then, graphically, the
first possibility listed in this test will establish the stationary point as the peak of a hill, such
as point D in Fig. 9.2b, whereas the second possibility will establish the stationary point as
the bottom of a valley, such as point C in the same diagram. Note, however, that in view of
the existence of a third possibility, yet to be discussed, we are unable to regard the
condition $f'(x) = 0$ as a sufficient condition for a relative extremum. But we now see that, if the
necessary condition $f'(x) = 0$ is satisfied, then the change-of-derivative-sign proviso can
serve as a sufficient condition for a relative maximum or minimum, depending on the
direction of the sign change.

Let us now explain the third possibility. In Fig. 9.3a, the function $f$ is shown to attain
a zero slope at point J (when $x = j$). Even though $f'(j)$ is zero, which makes $f(j)$ a
stationary value, the derivative does not change its sign from one side of $x = j$ to the
other; therefore, according to the first-derivative test, point J gives neither a maximum nor
a minimum, as is duly confirmed by the graph of the function. Rather, it exemplifies what
is known as an inflection point.

The characteristic feature of an inflection point is that, at that point, the derivative (as
against the primitive) function reaches an extreme value. Since this extreme value can be
either a maximum or a minimum, we have two types of inflection points. In Fig. 9.3a',
where we have plotted the derivative $f'(x)$, we see that its value is zero when $x = j$ (see
point J') but is positive on both sides of point J'; this makes J' a minimum point of the
derivative function $f'(x)$.

The other type of inflection point is portrayed in Fig. 9.3b, where the slope of the function
$g(x)$ increases till the point $k$ is reached and decreases thereafter. Consequently, the
graph of the derivative function $g'(x)$ will assume the shape shown in Fig. 9.3b', where
point K' gives a maximum value of the derivative function $g'(x)$.†

To sum up: A relative extremum must be a stationary value, but a stationary value may
be associated with either a relative extremum or an inflection point. To find the relative
maximum or minimum of a given function, therefore, the procedure should be first to find
the stationary values of the function where the condition $f'(x) = 0$ is satisfied, and then to
apply the first-derivative test to determine whether each of the stationary values is a relative
maximum, a relative minimum, or neither.

Example 1

Find the relative extrema of the function

$$y = f(x) = x^3 - 12x^2 + 36x + 8$$
First, we find the derivative function to be

$$f'(x) = 3x^2 - 24x + 36$$

To get the critical values, i.e., the values of $x$ satisfying the condition $f'(x) = 0$, we set the
quadratic derivative function equal to zero and get the quadratic equation

$$3x^2 - 24x + 36 = 0$$

By factoring the polynomial, as $3(x - 2)(x - 6) = 0$, or by applying the quadratic formula,
we then obtain the following pair of roots (solutions):

$$x_1^* = 6 \quad [\text{at which we have } f'(6) = 0 \text{ and } f(6) = 8]$$

$$x_2^* = 2 \quad [\text{at which we have } f'(2) = 0 \text{ and } f(2) = 40]$$

Since $f'(6) = f'(2) = 0$, these two values of $x$ are the critical values we desire.

It is easy to verify that, in the immediate neighborhood of $x = 6$, we have $f'(x) < 0$ for
$x < 6$, and $f'(x) > 0$ for $x > 6$; thus the value of the function $f(6) = 8$ is a relative
minimum. Similarly, since, in the immediate neighborhood of $x = 2$, we find $f'(x) > 0$ for
$x < 2$, and $f'(x) < 0$ for $x > 2$, the value of the function $f(2) = 40$ is a relative
maximum.
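
The two steps of this example (solve $f'(x) = 0$, then inspect the sign of $f'(x)$ on either
side of each root) can be mimicked numerically. The following Python sketch is only an
illustration: the helper name `classify` and the step size `h` are our own choices, and the
sign check is a numerical stand-in for the "immediate neighborhood" argument, valid only
when `h` is small enough that no other critical value lies within it.

```python
import math


def f(x):
    return x**3 - 12 * x**2 + 36 * x + 8


def f_prime(x):
    return 3 * x**2 - 24 * x + 36


def classify(derivative, x0, h=1e-3):
    """First-derivative test at a critical value x0 (assumes derivative(x0) = 0)."""
    left = derivative(x0 - h)
    right = derivative(x0 + h)
    if left > 0 and right < 0:
        return "relative maximum"
    if left < 0 and right > 0:
        return "relative minimum"
    return "neither"


# Roots of 3x^2 - 24x + 36 = 0 by the quadratic formula.
a, b, c = 3, -24, 36
disc = b * b - 4 * a * c
critical_values = sorted(
    [(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)]
)

for x0 in critical_values:
    print(x0, f(x0), classify(f_prime, x0))
```

Applied to a derivative such as $3x^2$ at $x = 0$, which is positive on both sides, the same
helper reports "neither", in line with possibility (c) of the test.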

† Note that a zero derivative value, while a necessary condition for a relative extremum, is not
required for an inflection point; for the derivative $g'(x)$ has a positive value at $x = k$, and yet
point K is an inflection point.

Figure 9.4 shows the graph of the function of this example. Such a graph may be used
to verify the location of extreme values obtained through use of the first-derivative test.
But, in reality, in most cases "helpfulness" flows in the opposite direction: the mathematically
derived extreme values will help in plotting the graph. The accurate plotting of a
graph ideally requires knowledge of the value of the function at every point in the domain;
but as a matter of actual practice, only a few points in the domain are selected for purposes
of plotting, and the rest of the points typically are filled in by interpolation. The pitfall of
this practice is that, unless we hit upon the stationary point(s) by coincidence, we shall miss
the exact location of the turning point(s) in the curve. Now, with the first-derivative test at
our disposal, it becomes possible to locate these turning points precisely.


Example 2 


Find the relative extremum of the average-cost function

$$AC = f(Q) = Q^2 - 5Q + 8$$

The derivative here is $f'(Q) = 2Q - 5$, a linear function. Setting $f'(Q)$ equal to zero, we get
the linear equation $2Q - 5 = 0$, which has the single root $Q^* = 2.5$. This is the only critical
value in this case. To apply the first-derivative test, let us find the values of the derivative
at, say, $Q = 2.4$ and $Q = 2.6$, respectively. Since $f'(2.4) = -0.2 < 0$ whereas $f'(2.6) =
0.2 > 0$, we can conclude that the stationary value $AC = f(2.5) = 1.75$ represents a relative
minimum. The graph of the function of this example is actually a U-shaped curve, so that
the relative minimum already found will also be the absolute minimum. Our knowledge of
the exact location of this point should be of great help in plotting the AC curve.


EXERCISE 9.2 

1. Find the stationary values of the following (check whether they are relative maxima or
   minima or inflection points), assuming the domain to be the set of all real numbers:
   (a) $y = -2x^2 + 8x + 7$   (b) $y = 5x^2 - x$   (c) $y = 3x^2 + 3$   (d) $y = 3x^2 - 6x + 2$

2. Find the stationary values of the following (check whether they are relative maxima or
   minima or inflection points), assuming the domain to be the interval $[0, \infty)$:

   (a) $y = x^3 - 3x + 5$

   (b) $y = \frac{1}{3}x^3 - x^2 + x + 10$

   (c) $y = -x^3 + 4.5x^2 - 6x + 6$

3. Show that the function $y = x + 1/x$ (with $x \neq 0$) has two relative extrema, one a
   maximum and the other a minimum. Is the "minimum" larger or smaller than the
   "maximum"? How is this paradoxical result possible?

4. Let $T = \phi(x)$ be a total function (e.g., total product or total cost):

   (a) Write out the expressions for the marginal function $M$ and the average function $A$.

   (b) Show that, when $A$ reaches a relative extremum, $M$ and $A$ must have the same
       value.

   (c) What general principle does this suggest for the drawing of a marginal curve and
       an average curve in the same diagram?

   (d) What can you conclude about the elasticity of the total function $T$ at the point
       where $A$ reaches an extreme value?


9.3 Second and Higher Derivatives

Hitherto we have considered only the first derivative $f'(x)$ of a function $y = f(x)$; now let
us introduce the concept of second derivative (short for second-order derivative), and
derivatives of even higher orders. These will enable us to develop alternative criteria for
locating the relative extrema of a function.

Derivative of a Derivative 

Since the first derivative $f'(x)$ is itself a function of $x$, it, too, should be differentiable with
respect to $x$, provided that it is continuous and smooth. The result of this differentiation,
known as the second derivative of the function $f$, is denoted by

$f''(x)$, where the double prime indicates that $f(x)$ has been differentiated with
respect to $x$ twice, and where the expression $(x)$ following the double
prime suggests that the second derivative is again a function of $x$

or

$\dfrac{d^2 y}{dx^2}$, where the notation stems from the consideration that the second derivative
means, in fact, $\dfrac{d}{dx}\left(\dfrac{dy}{dx}\right)$; hence, the $d^2$ (read: "d-two") in the numerator
and $dx^2$ (read: "dx squared") in the denominator of this symbol.

If the second derivative $f''(x)$ exists for all $x$ values in the domain, the function $f(x)$ is said
to be twice differentiable; if, in addition, $f''(x)$ is continuous, the function $f(x)$ is said to
be twice continuously differentiable. Just as the notation $f \in C^{(1)}$ or $f \in C'$ is often used
to indicate that the function $f$ is continuously differentiable, an analogous notation

$$f \in C^{(2)} \quad \text{or} \quad f \in C''$$

can be used to signify that $f$ is twice continuously differentiable.

As a function of $x$, the second derivative can be differentiated with respect to $x$ again to
produce a third derivative, which in turn can be the source of a fourth derivative, and so on
ad infinitum, as long as the differentiability condition is met. These higher-order derivatives
are symbolized along the same line as the second derivative:

$$f'''(x),\ f^{(4)}(x),\ \ldots,\ f^{(n)}(x) \quad [\text{with superscripts enclosed in ( )}]$$

or

$$\frac{d^3 y}{dx^3},\ \frac{d^4 y}{dx^4},\ \ldots,\ \frac{d^n y}{dx^n}$$

The last of these can also be written as $\dfrac{d^n}{dx^n}\,y$, where the $\dfrac{d^n}{dx^n}$ part serves as an
operator symbol instructing us to take the $n$th derivative of (some function) with respect to $x$.

Almost all the specific functions we shall be working with possess continuous derivatives
up to any order we desire; i.e., they are continuously differentiable any number of
times. Whenever a general function is used, such as $f(x)$, we always assume that it has
derivatives up to any order we need.


Example 1 


Find the first through the fifth derivatives of the function

$$y = f(x) = 4x^4 - x^3 + 17x^2 + 3x - 1$$

The desired derivatives are as follows:

$$f'(x) = 16x^3 - 3x^2 + 34x + 3$$

$$f''(x) = 48x^2 - 6x + 34$$

$$f'''(x) = 96x - 6$$

$$f^{(4)}(x) = 96$$

$$f^{(5)}(x) = 0$$

In this particular (polynomial) example, we note that each successive derivative function
emerges as a lower-order polynomial: from cubic to quadratic, to linear, to constant. We
note also that the fifth derivative, being the derivative of a constant, is equal to zero for all
values of $x$; we could therefore have written it as $f^{(5)}(x) \equiv 0$ as well. The equation
$f^{(5)}(x) \equiv 0$ (zero for all $x$) should be carefully distinguished from the equation
$f^{(5)}(x_0) = 0$ (zero at $x_0$ only). Also, understand that the statement $f^{(5)}(x) \equiv 0$ does
not mean that the fifth derivative does not exist; it indeed exists, and has the value zero.
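
For polynomials, the successive derivatives of this example can be reproduced mechanically
with the power rule. The sketch below is only an illustration: the coefficient-list
representation (position $i$ holds the coefficient of $x^i$) and the helper name
`differentiate` are our own conventions, not notation used elsewhere in this book.

```python
def differentiate(coeffs):
    """Power rule on a coefficient list; coeffs[i] is the coefficient of x**i."""
    derived = [i * c for i, c in enumerate(coeffs)][1:]
    # The derivative of a constant is the zero polynomial.
    return derived or [0]


# f(x) = 4x^4 - x^3 + 17x^2 + 3x - 1, lowest power first.
f = [-1, 3, 17, -1, 4]

derivatives = []
current = f
for _ in range(5):
    current = differentiate(current)
    derivatives.append(current)

for order, d in enumerate(derivatives, start=1):
    print(order, d)
```

The fifth entry is the zero polynomial, which exists and equals zero for every $x$.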


Example 2 


Find the first four derivatives of the rational function

$$y = g(x) = \frac{x}{1 + x}$$

These derivatives can be found either by use of the quotient rule, or, after rewriting the
function as $y = x(1 + x)^{-1}$, by the product rule:

$$g'(x) = (1 + x)^{-2}$$

$$g''(x) = -2(1 + x)^{-3}$$

$$g'''(x) = 6(1 + x)^{-4}$$

$$g^{(4)}(x) = -24(1 + x)^{-5}$$

In this case, repeated differentiation evidently does not tend to simplify the subsequent
derivative expressions.

Note that, like the primitive function $g(x)$, all the successive derivatives obtained are
themselves functions of $x$. Given specific values of $x$, however, these derivative functions
will then take specific values. When $x = 2$, for instance, the second derivative in Example 2
can be evaluated as

$$g''(2) = -2(3)^{-3} = -\frac{2}{27}$$

and similarly for other values of $x$. It is of the utmost importance to realize that to evaluate
this second derivative $g''(x)$ at $x = 2$, as we did, we must first obtain $g''(x)$ from $g(x)$ and
then substitute $x = 2$ into the equation for $g''(x)$. It is incorrect to substitute $x = 2$ into
$g(x)$ or $g'(x)$ prior to the differentiation process leading to $g''(x)$.
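
The order of operations (differentiate first, substitute afterward) can be checked with a
short Python sketch. The exact value uses the formula for $g''(x)$ derived in Example 2; the
central-difference approximation and its step size `h` are assumptions introduced here
purely as an independent numerical cross-check.

```python
from fractions import Fraction


def g(x):
    return x / (1 + x)


def g_second(x):
    # Formula from Example 2: g''(x) = -2(1 + x)^(-3), evaluated exactly.
    x = Fraction(x)
    return -2 / (1 + x) ** 3


# Correct order: obtain g''(x) first, then substitute x = 2.
exact = g_second(2)

# Independent check: second central difference of g around x = 2.
h = 1e-3
approx = (g(2 + h) - 2 * g(2) + g(2 - h)) / h**2

print(exact, approx)
```

Substituting $x = 2$ into $g(x)$ first would leave only the constant $2/3$, whose derivatives
are all zero, which is why the substitution must come last.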


Interpretation of the Second Derivative 

The derivative function $f'(x)$ measures the rate of change of the function $f$. By the same
token, the second-derivative function $f''(x)$ is the measure of the rate of change of the first
derivative $f'(x)$; in other words, the second derivative measures the rate of change of the rate
of change of the original function $f$. To put it differently, with a given infinitesimal increase
in the independent variable $x$ from a point $x = x_0$,

$f'(x_0) > 0$ means that the value of the function tends to increase
$f'(x_0) < 0$ means that the value of the function tends to decrease

whereas, with regard to the second derivative,

$f''(x_0) > 0$ means that the slope of the curve tends to increase
$f''(x_0) < 0$ means that the slope of the curve tends to decrease

Thus a positive first derivative coupled with a positive second derivative at $x = x_0$
implies that the slope of the curve at that point is positive and increasing. In other words,
the value of the function is increasing at an increasing rate. Likewise, a positive first
derivative with a negative second derivative indicates that the slope of the curve is positive but
decreasing: the value of the function is increasing at a decreasing rate. The case of a
negative first derivative can be interpreted analogously, but a warning is in order in this case:
When $f'(x_0) < 0$ and $f''(x_0) > 0$, the slope of the curve is negative and increasing, but
this does not mean that the slope is changing, say, from (-10) to (-11); on the contrary, the
change should be from (-11), a smaller number, to (-10), a larger number. In other words,
the negative slope must tend to be less steep as $x$ increases. Lastly, when $f'(x_0) < 0$ and
$f''(x_0) < 0$, the slope of the curve must be negative and decreasing. This refers to a
negative slope that tends to become steeper as $x$ increases.
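
The warning about a negative but increasing slope can be illustrated with a small Python
sketch. The function $f(x) = 0.5x^2 - 11x$ is a hypothetical example chosen so that the
slope moves from (-11) to (-10) as $x$ goes from 0 to 1; the helper `describe` and its labels
are likewise our own.

```python
def describe(first, second):
    """Translate the signs of f'(x0) and f''(x0) into words."""

    def trend(value):
        if value > 0:
            return "increasing"
        if value < 0:
            return "decreasing"
        return "stationary"

    # (trend of the function value, trend of the slope)
    return trend(first), trend(second)


# Hypothetical function f(x) = 0.5x^2 - 11x (an assumption for illustration).
def f_prime(x):
    return x - 11


def f_second(x):
    return 1


value_trend, slope_trend = describe(f_prime(0), f_second(0))
slopes = [f_prime(0), f_prime(1)]

print(value_trend, slope_trend, slopes)
```

The slope rises from -11 to -10, so it is increasing in the algebraic sense even though the
curve becomes less steep.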

All of this can be further clarified with a graphical explanation. Figure 9.5a illustrates a
function with $f''(x) < 0$ throughout. Since the slope must steadily decrease as $x$ increases
on the graph, we will, when we move from left to right, pass through a point A with a
positive slope, then a point B with zero slope, and then a point C with a negative slope. It may
happen, of course, that a function with $f''(x) < 0$ is characterized by $f'(x) > 0$
everywhere, and thus plots only as the rising portion of an inverse U-shaped curve, or, with
$f'(x) < 0$ everywhere, plots only as the declining portion of that curve.

The opposite case of a function with $f''(x) > 0$ throughout is illustrated in Fig. 9.5b.
Here, as we pass through points D to E to F, the slope steadily increases and changes from
negative to zero to positive. Again, we add that a function characterized by $f''(x) > 0$
throughout may, depending on the first-derivative specification, plot only as the declining
or the rising portion of a U-shaped curve.

FIGURE 9.5

From Fig. 9.5, it is evident that the second derivative $f''(x)$ relates to the curvature of a
graph; it determines how the curve tends to bend itself. To describe the two types of
differing curvatures discussed, we refer to the one in Fig. 9.5a as strictly concave, and the one in
Fig. 9.5b as strictly convex. And, understandably, a function whose graph is strictly concave
(strictly convex) is called a strictly concave (strictly convex) function. The precise
geometric characterization of a strictly concave function is as follows. If we pick any pair of points
M and N on its curve and join them by a straight line, the line segment MN must lie entirely
below the curve, except at points M and N. The characterization of a strictly convex
function can be obtained by substituting the word above for the word below in the last statement.
Try this out in Fig. 9.5. If the characterizing condition is relaxed somewhat, so that the line
segment MN is allowed to lie either below the curve, or along (coinciding with) the curve,
then we will be describing instead a concave function, without the adverb strictly.
Similarly, if the line segment MN either lies above, or lies along the curve, then the function is
convex, again without the adverb strictly. Note that, since the line segment MN may
coincide with a (nonstrictly) concave or convex curve, the latter may very well contain a linear
segment. In contrast, a strictly concave or convex curve can never contain a linear segment
anywhere. It follows that while a strictly concave (convex) function is automatically a
concave (convex) function, the converse is not true.†

† We shall discuss these concepts further in Sec. 11.5.

From our earlier discussion of the second derivative, we may now infer that if the second
derivative $f''(x)$ is negative for all $x$, then the primitive function $f(x)$ must be a strictly
concave function. Similarly, $f(x)$ must be strictly convex if $f''(x)$ is positive for all $x$.
Despite this, it is not valid to reverse this inference and say that, if $f(x)$ is strictly concave
(strictly convex), then $f''(x)$ must be negative (positive) for all $x$. This is because, in certain
exceptional cases, the second derivative may have a zero value at a stationary point on such
a curve. An example of this can be found in the function $y = f(x) = x^4$, which plots as a
strictly convex curve, but whose derivatives

$$f'(x) = 4x^3 \qquad f''(x) = 12x^2$$

indicate that, at the stationary point where $x = 0$, the value of the second derivative is
$f''(0) = 0$. Note, however, that at any other point, with $x \neq 0$, the second derivative of this
function does have the (expected) positive sign. Aside from the possibility of a zero value
at a stationary point, therefore, the second derivative of a strictly concave or convex
function may be expected in general to adhere to a single algebraic sign.
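
The $x^4$ case can be probed numerically. The Python sketch below checks a midpoint version
of the chord characterization on a small grid of points; both the grid and the restriction to
midpoints are simplifications made for this illustration, so a passing check is consistent
with strict convexity but does not by itself prove it.

```python
def f(x):
    return x**4


def f_second(x):
    return 12 * x**2


# Sample points -2.0, -1.75, ..., 2.0 (an arbitrary illustrative grid).
points = [i / 4 for i in range(-8, 9)]

# Strict convexity requires the chord to lie above the curve; here we
# compare the curve with the chord at the midpoint of every pair a < b.
midpoint_below_chord = all(
    f((a + b) / 2) < (f(a) + f(b)) / 2
    for a in points
    for b in points
    if a < b
)

print(midpoint_below_chord, f_second(0.0))
```

The chord test passes for every sampled pair, yet the second derivative vanishes at $x = 0$,
so a zero value of $f''$ at a single point is compatible with strict convexity.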

For other types of function, the second derivative may take both positive and negative
values, depending on the value of $x$. In Fig. 9.3a and b, for instance, both $f(x)$ and $g(x)$
undergo a sign change in the second derivative at their respective inflection points, J and K.
According to Fig. 9.3a', the slope of $f'(x)$ (that is, the value of $f''(x)$) changes from
negative to positive at $x = j$; the exact opposite occurs with the slope of $g'(x)$ (that is, the
value of $g''(x)$) on the basis of Fig. 9.3b'. Translated into curvature terms, this means that
the graph of $f(x)$ turns from strictly concave to strictly convex at point J, whereas the
graph of $g(x)$ has the reverse change at point K. Consequently, instead of characterizing an
inflection point as a point where the first derivative reaches an extreme value, we may
alternatively characterize it as a point where the function undergoes a change in curvature
or a change in the sign of its second derivative.

An Application 

The two curves in Fig. 9.5 exemplify the graphs of quadratic functions, which may be
expressed generally in the form

$$y = ax^2 + bx + c \qquad (a \neq 0)$$

From our discussion of the second derivative, we can now derive a convenient way of
determining whether a given quadratic function will have a strictly convex (U-shaped) or
a strictly concave (inverse U-shaped) graph.

Since the second derivative of the quadratic function cited is $d^2y/dx^2 = 2a$, this
derivative will always have the same algebraic sign as the coefficient $a$. Recalling that a positive
second derivative implies a strictly convex curve, we can infer that a positive coefficient $a$
in the preceding quadratic function gives rise to a U-shaped graph. In contrast, a negative
coefficient $a$ leads to a strictly concave curve, shaped like an inverted U.

As intimated at the end of Sec. 9.2, the relative extremum of this function will also prove
to be its absolute extremum, because in a quadratic function there can be found only a
single valley or peak, evident in a U or inverted U, respectively.
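
This sign rule translates directly into a few lines of Python. In the sketch below the helper
name `quadratic_shape` and its return format are our own; the critical value $x^* = -b/(2a)$
follows from setting the first derivative $2ax + b$ equal to zero. The first call uses the
average-cost function of Example 2 in Sec. 9.2, while the second uses a hypothetical
quadratic.

```python
def quadratic_shape(a, b, c):
    """Classify y = ax^2 + bx + c by the sign of its second derivative 2a."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic function")
    x_star = -b / (2 * a)  # where the first derivative 2ax + b vanishes
    y_star = a * x_star**2 + b * x_star + c
    shape = "strictly convex" if a > 0 else "strictly concave"
    return shape, x_star, y_star


# AC = Q^2 - 5Q + 8 from Example 2 of Sec. 9.2: U-shaped, so a minimum.
print(quadratic_shape(1, -5, 8))

# Hypothetical quadratic y = -x^2 + 4x + 1: inverted U, so a maximum.
print(quadratic_shape(-1, 4, 1))
```

For a strictly convex quadratic the stationary value returned is the absolute minimum; for a
strictly concave one it is the absolute maximum.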


Attitudes toward Risk 

The most common application of the concept of marginal utility is to the context of goods
consumption. But in another useful application, we consider the marginal utility of income,
or more to the point of the present discussion, the payoff to a betting game, and use this
concept to distinguish between different individuals' attitudes toward risk.

Consider the game where, for a fixed sum of money paid in advance (the cost of
the game), you can throw a die and collect \$10 if an odd number shows up, or \$20 if the
number is even. In view of the equal probability of the two outcomes, the mathematically
expected value of payoff is

$$EV = 0.5 \times \$10 + 0.5 \times \$20 = \$15$$

FIGURE 9.6

The game is deemed a fair game, or fair bet, if the cost of the game is exactly \$15. Despite
its fairness, playing such a game still involves a risk, for even though the probability
distribution of the two possible outcomes is known, the actual result of any individual play is not.
Hence, people who are "risk-averse" would consistently decline to play such a game. On
the other hand, there are "risk-loving" or "risk-preferring" people who would welcome fair
games, or even games with odds set against them (i.e., with the cost of the game exceeding
the expected value of payoff).

The explanation for such diverse attitudes toward risk is easily found in the differing
utility functions people possess. Assume that a potential player has the strictly concave
utility function $U = U(x)$ depicted in Fig. 9.6a, where $x$ denotes the payoff, with $U(0) = 0$,
$U'(x) > 0$ (positive marginal utility of income or payoff), and $U''(x) < 0$ (diminishing
marginal utility) for all $x$. The economic decision facing this person involves the choice
between two courses of action: First, by not playing the game, the person saves the \$15 cost
of the game (= EV) and thus enjoys the utility level $U(\$15)$, measured by the height of
point A on the curve. Second, by playing, the person has a .5 probability of receiving \$10
and thus enjoying $U(\$10)$ (see point M), plus a .5 probability of receiving \$20 and thus
enjoying $U(\$20)$ (see point N). The expected utility from playing is, therefore, equal to

$$EU = 0.5 \times U(\$10) + 0.5 \times U(\$20)$$

which, being the average of the height of M and that of N, is measured by the height of point
B, the midpoint on the line segment MN. Since, by the defining property of a strictly
concave utility function, line segment MN must lie below arc MN, point B must be lower than
point A; that is, EU, the expected utility from playing, falls short of the utility of the cost of
the game, and the game should be avoided. For this reason, a strictly concave utility
function is associated with risk-averse behavior.

For a risk-loving person, the decision process is analogous, but the opposite choice will
be made, because now the relevant utility function is a strictly convex one. In Fig. 9.6b,
U($15), the utility of keeping the $15 by not playing the game, is shown by point A' on the
curve, and EU, the expected utility from playing, is given by B', the midpoint on the line
segment M'N'. But this time line segment M'N' lies above arc M'N', and point B' is above
point A'. Thus there definitely is a positive incentive to play the game. In contrast to the
situation in Fig. 9.6a, we can thus associate a strictly convex utility function with risk-loving
behavior.
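
The comparison between U($15) and EU can also be checked numerically. The following is a minimal sketch, not part of the original text: the two utility functions (a square root for the strictly concave case and a square for the strictly convex case) and the function names are assumptions chosen only for illustration.

```python
import math

# The die-throwing game from the text: (probability, payoff) pairs.
GAME = [(0.5, 10.0), (0.5, 20.0)]


def expected_value(outcomes):
    """Mathematically expected payoff, EV."""
    return sum(p * x for p, x in outcomes)


def expected_utility(utility, outcomes):
    """Expected utility from playing, EU."""
    return sum(p * utility(x) for p, x in outcomes)


def attitude(utility, outcomes, tol=1e-9):
    """Compare U(EV), the utility of keeping the fair cost, with EU."""
    u_keep = utility(expected_value(outcomes))
    eu_play = expected_utility(utility, outcomes)
    if eu_play < u_keep - tol:
        return "risk-averse"       # point B lies below point A
    if eu_play > u_keep + tol:
        return "risk-loving"       # point B' lies above point A'
    return "indifferent to a fair game"


# Hypothetical utility functions (assumptions, not taken from the text).
concave_u = math.sqrt              # U'(x) > 0, U''(x) < 0 for x > 0
convex_u = lambda x: x ** 2        # U'(x) > 0, U''(x) > 0 for x > 0

print(expected_value(GAME))        # 15.0
print(attitude(concave_u, GAME))   # risk-averse
print(attitude(convex_u, GAME))    # risk-loving
```

Any other strictly concave or strictly convex increasing function could be substituted for the two assumed utilities; the classification depends only on the curvature.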

EXERCISE 9.3 

1. Find the second and third derivatives of the following functions: 

(a) ax^2 + bx + c          (c) 3x/(1 - x)         (x ≠ 1)

(b) 7x^4 - 3x - 4          (d) (1 + x)/(1 - x)    (x ≠ 1)

2. Which of the following quadratic functions are strictly convex? 

(a) y = 9x^2 - 4x + 8      (c) u = 9 - 2x^2

(b) w = -3x^2 + 39         (d) v = 8 - 5x + x^2

3. Draw (a) a concave curve which is not strictly concave, and (b) a curve which qualifies
simultaneously as a concave curve and a convex curve. 

4. Given the function y = a - b/(c + x) (a, b, c > 0; x ≥ 0), determine the general shape of
its graph by examining (a) its first and second derivatives, (b) its vertical intercept, and
(c) the limit of y as x tends to infinity. If this function is to be used as a consumption
function, how should the parameters be restricted in order to make it economically sensible?

5. Draw the graph of a function f(x) such that f'(x) ≡ 0, and the graph of a function g(x)
such that g'(3) = 0. Summarize in one sentence the essential difference between f(x)
and g(x) in terms of the concept of stationary point.

6. A person who is neither risk-averse nor risk-loving (indifferent toward a fair game) is 
said to be "risk-neutral." 

(a) What kind of utility function would you use to characterize such a person? 

(b) Using the die-throwing game detailed in the text, describe the relationship between
U($15) and EU for the risk-neutral person.

9.4 Second-Derivative Test

Returning to the pair of extreme points B and E in Fig. 9.5 and remembering the newly
established relationship between the second derivative and the curvature of a curve, we
should be able to see the validity of the following criterion for a relative extremum:

Second-derivative test for relative extremum  If the value of the first derivative of a
function f at x = x_0 is f'(x_0) = 0, then the value of the function at x_0, f(x_0), will be

a. A relative maximum if the second-derivative value at x_0 is f''(x_0) < 0.

b. A relative minimum if the second-derivative value at x_0 is f''(x_0) > 0.

This test is in general more convenient to use than the first-derivative test, because it does
not require us to check the derivative sign to both the left and the right of x_0. But it has the
drawback that no unequivocal conclusion can be drawn in the event that f''(x_0) = 0. For then
the stationary value f(x_0) can be either a relative maximum, or a relative minimum, or even
an inflectional value.† When the situation of f''(x_0) = 0 is encountered, we must either revert
to the first-derivative test, or resort to another test, to be developed in Sec. 9.6, that involves
the third or even higher derivatives. For most problems in economics, however, the
second-derivative test would usually be adequate for determining a relative maximum or minimum.

Example 1

Find the relative extremum of the function

\[ y = f(x) = 4x^2 - x \]

The first and second derivatives are

\[ f'(x) = 8x - 1 \quad \text{and} \quad f''(x) = 8 \]

Setting f'(x) equal to zero and solving the resulting equation, we find the (only) critical
value to be x* = 1/8, which yields the (only) stationary value f(1/8) = -1/16. Because the
second derivative is positive (in this case it is indeed positive for any value of x), the
extremum is established as a minimum. Further, since the given function plots as a U-shaped
curve, the relative minimum is also the absolute minimum.

Example 2

Find the relative extrema of the function

\[ y = g(x) = x^3 - 3x^2 + 2 \]

The first two derivatives of this function are

\[ g'(x) = 3x^2 - 6x \quad \text{and} \quad g''(x) = 6x - 6 \]

Setting g'(x) equal to zero and solving the resulting quadratic equation, 3x^2 - 6x = 0, we
obtain the critical values x_1* = 2 and x_2* = 0, which in turn yield the two stationary values:

\[ g(2) = -2 \qquad [\text{a minimum because } g''(2) = 6 > 0] \]

\[ g(0) = 2 \qquad [\text{a maximum because } g''(0) = -6 < 0] \]
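
The mechanics of the second-derivative test in Examples 1 and 2 can be sketched in a few lines of Python. This is only an illustrative sketch: representing a polynomial as a list of coefficients in ascending powers, and the helper names used here, are assumptions of this sketch and not notation from the text.

```python
def derivative(coeffs):
    """Derivative of a polynomial [c0, c1, c2, ...] given in ascending powers."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0.0]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def second_derivative_test(coeffs, x0, tol=1e-12):
    """Classify the stationary value f(x0) by the sign of f''(x0)."""
    f1 = derivative(coeffs)
    f2 = derivative(f1)
    if abs(evaluate(f1, x0)) > tol:
        raise ValueError("first-order condition f'(x0) = 0 is not satisfied")
    curvature = evaluate(f2, x0)
    if curvature < -tol:
        return "relative maximum"
    if curvature > tol:
        return "relative minimum"
    return "inconclusive"          # f''(x0) = 0: the test gives no verdict


f = [0.0, -1.0, 4.0]               # Example 1: 4x^2 - x
g = [2.0, 0.0, -3.0, 1.0]          # Example 2: x^3 - 3x^2 + 2

print(second_derivative_test(f, 1 / 8), evaluate(f, 1 / 8))
print(second_derivative_test(g, 2.0), evaluate(g, 2.0))
print(second_derivative_test(g, 0.0), evaluate(g, 0.0))
```

The "inconclusive" branch corresponds to the case f''(x_0) = 0 discussed above, in which another test has to be used.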

Necessary versus Sufficient Conditions 

As was the case with the first-derivative test, the zero-slope condition f'(x) = 0 plays the
role of a necessary condition in the second-derivative test. Since this condition is based on
the first-order derivative, it is often referred to as the first-order condition. Once we find the
first-order condition satisfied at x = x_0, the negative (positive) sign of f''(x_0) is sufficient
to establish the stationary value in question as a relative maximum (minimum). These
sufficient conditions, which are based on the second-order derivative, are often referred to as
second-order conditions.


† To see that an inflection point is possible when f''(x_0) = 0, let us refer back to Fig. 9.3a and 9.3a'.
Point J in the upper diagram is an inflection point, with x = j as its critical value. Since the f'(x)
curve in the lower diagram attains a minimum at x = j, the slope of f'(x) [i.e., f''(x)] must be zero
at the critical value x = j. Thus point J illustrates an inflection point occurring when f''(x_0) = 0.

To see that a relative extremum is also consistent with f''(x_0) = 0, consider the function y = x^4.
This function plots as a U-shaped curve and has a minimum, y = 0, attained at the critical value
x = 0. Since the second derivative of this function is f''(x) = 12x^2, we again obtain a zero value for
this derivative at the critical value x = 0. Thus this function illustrates a relative extremum occurring
when f''(x_0) = 0.
TABLE 9.1  Conditions for a Relative Extremum: y = f(x)

Condition                    Maximum          Minimum
First-order necessary        f'(x) = 0        f'(x) = 0
Second-order necessary*      f''(x) ≤ 0       f''(x) ≥ 0
Second-order sufficient*     f''(x) < 0       f''(x) > 0

* Applicable only after the first-order necessary condition has been satisfied.


It bears repeating that the first-order condition is necessary, but not sufficient, for a
relative maximum or minimum. (Remember inflection points?) In sharp contrast, the
second-order condition that f''(x) be negative (positive) at the critical value x_0 is sufficient for a
relative maximum (minimum), but it is not necessary. [Remember the relative extremum
that occurs when f''(x_0) = 0?] For this reason, one should carefully guard against the
following line of argument: “Since the stationary value f(x_0) is already known to be a
minimum, we must have f''(x_0) > 0.” The reasoning here is faulty because it incorrectly treats
the positive sign of f''(x_0) as a necessary condition for f(x_0) to be a minimum.
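
A small numerical sketch may help fix the distinction. It applies a crude first-derivative sign check on either side of x_0 = 0 to three functions whose second derivative vanishes there. The functions y = x^3 and y = -x^4, the step size h, and the function names are assumptions added here for illustration; only y = x^4 is taken from the text.

```python
def first_derivative_test(fprime, x0, h=1e-3):
    """Classify a critical value by the sign of f' just left and right of x0.

    A rough numerical sketch: it assumes f' does not change sign again
    within the small distance h of x0.
    """
    left, right = fprime(x0 - h), fprime(x0 + h)
    if left > 0 > right:
        return "relative maximum"
    if left < 0 < right:
        return "relative minimum"
    return "no relative extremum"


# Each function below has f'(0) = 0 and f''(0) = 0.
cases = {
    "x^4": lambda x: 4 * x ** 3,       # f'(x) for y = x^4
    "-x^4": lambda x: -4 * x ** 3,     # f'(x) for y = -x^4
    "x^3": lambda x: 3 * x ** 2,       # f'(x) for y = x^3
}

for name, fprime in cases.items():
    print(name, "->", first_derivative_test(fprime, 0.0))
```

All three cases satisfy f''(0) = 0, yet the outcomes differ, which is why a zero second derivative at the critical value settles nothing by itself.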

This is not to say that second-order derivatives can never be used in stating necessary
conditions for relative extrema. Indeed they can. But care must then be taken to allow for
the fact that a relative maximum (minimum) can occur not only when f''(x_0) is negative
(positive), but also when f''(x_0) is zero. Consequently, second-order necessary conditions
must be couched in terms of weak inequalities: for a stationary value f(x_0) to be a relative
maximum, it is necessary that f''(x_0) ≤ 0; for it to be a relative minimum, it is necessary
that f''(x_0) ≥ 0.

The preceding discussion can be summed up in Table 9.1. All the equations and
inequalities in the table are in the nature of conditions (requirements) to be met, rather than
descriptive specifications of a given function. In particular, the equation f'(x) = 0 does not
signify that function f has a zero slope everywhere; rather, it states the stipulation that only
those values of x that satisfy this requirement can qualify as critical values.


Conditions for Profit Maximization 

We shall now present an economic example of extreme-value problems, i.e., problems of
optimization.

One of the first things that a student of economics learns is that, in order to maximize
profit, a firm must equate marginal cost and marginal revenue. Let us show the mathematical
derivation of this condition. To keep the analysis on a general level, we shall work with
the total-revenue function R = R(Q) and total-cost function C = C(Q), both of which are
functions of a single variable Q. From these it follows that a profit function (the objective
function) may also be formulated in terms of Q (the choice variable):

\[ \pi = \pi(Q) = R(Q) - C(Q) \tag{9.1} \]

To find the profit-maximizing output level, we must satisfy the first-order necessary
condition for a maximum: dπ/dQ = 0. Accordingly, let us differentiate (9.1) with respect
to Q and set the resulting derivative equal to zero. The result is

\[ \frac{d\pi}{dQ} \equiv \pi'(Q) = R'(Q) - C'(Q) = 0 \quad \text{iff} \quad R'(Q) = C'(Q) \tag{9.2} \]

Thus the optimum output (equilibrium output) Q* must satisfy the equation R'(Q*) =
C'(Q*), or MR = MC. This condition constitutes the first-order condition for profit
maximization.

However, the first-order condition may lead to a minimum rather than a maximum; thus
we must check the second-order condition next. We can obtain the second derivative by
differentiating the first derivative in (9.2) with respect to Q:

\[ \frac{d^2\pi}{dQ^2} \equiv \pi''(Q) = R''(Q) - C''(Q) \le 0 \quad \text{iff} \quad R''(Q) \le C''(Q) \]

This last inequality is the second-order necessary condition for maximization. If it is not
met, then Q* cannot possibly maximize profit; in fact, it minimizes profit. If R''(Q*) =
C''(Q*), then we are unable to reach a definite conclusion. The best scenario is to find
R''(Q*) < C''(Q*), which satisfies the second-order sufficient condition for a maximum.
In that case, we can conclusively take Q* to be a profit-maximizing output. Economically,
this would mean that, if the rate of change of MR is less than the rate of change of MC at
the output where MC = MR, then that output will maximize profit.

These conditions are illustrated in Fig. 9.7. In Fig. 9.7a we have drawn a total-revenue
and a total-cost curve, which are seen to intersect twice, at output levels of Q_2 and Q_4. In
the open interval (Q_2, Q_4), total revenue R exceeds total cost C, and thus π is positive. But
in the intervals [0, Q_2) and (Q_4, Q_5], where Q_5 represents the upper limit of the firm’s
productive capacity, π is negative. This fact is reflected in Fig. 9.7b, where the profit curve,
obtained by plotting the vertical distance between the R and C curves for each level of
output, lies above the horizontal axis only in the interval (Q_2, Q_4).

When we set dπ/dQ = 0, in line with the first-order condition, it is our intention to
locate the peak point K on the profit curve, at output Q_3, where the slope of the curve is
zero. However, the relative-minimum point M (output Q_1) will also offer itself as a
candidate, because it, too, meets the zero-slope requirement. Below, we shall resort to the
second-order condition to eliminate the “wrong” kind of extremum.

The first-order condition dπ/dQ = 0 is equivalent to the condition R'(Q) = C'(Q). In
Fig. 9.7a, the output level Q_3 satisfies this, because the R and C curves do have the same
slope at Q_3 (the tangent lines drawn to the two curves at H and J are parallel to each other).
The same is true for output Q_1. Since the equality of the slopes of R and C means the
equality of MR and MC, outputs Q_3 and Q_1 must obviously be where the MR and MC curves
intersect, as illustrated in Fig. 9.7c.

How does the second-order condition enter into the picture? Let us first look at Fig. 9.7b.
At point K, the second derivative of the π function will (barring the exceptional zero-value
case) have a negative value, π''(Q_3) < 0, because the curve is inverse U-shaped around K;
this means that Q_3 will maximize profit. At point M, on the other hand, we would expect
that π''(Q_1) > 0; thus Q_1 provides a relative minimum for π instead. The second-order
sufficient condition for a maximum can, of course, be stated alternatively as
R''(Q) < C''(Q), that is, that the slope of the MR curve be less than the slope of the MC
curve. From Fig. 9.7c, it is immediately apparent that output Q_3 satisfies this condition,
since the slope of MR is negative while that of MC is positive at point L. But output Q_1
violates this condition because both MC and MR have negative slopes, and that of MR is
numerically smaller than that of MC at point N, which implies that R''(Q_1) is greater than
C''(Q_1).
Example 3

Let the R(Q) and C(Q) functions be

\[ R(Q) = 1{,}200Q - 2Q^2 \]

\[ C(Q) = Q^3 - 61.25Q^2 + 1{,}528.5Q + 2{,}000 \]

Then the profit function is

\[ \pi(Q) = -Q^3 + 59.25Q^2 - 328.5Q - 2{,}000 \]

where R, C, and π are all in dollar units and Q is in units of (say) tons per week. This profit
function has two critical values, Q = 3 and Q = 36.5, because

\[ \frac{d\pi}{dQ} = -3Q^2 + 118.5Q - 328.5 = 0 \quad \text{when} \quad Q = \begin{cases} 3 \\ 36.5 \end{cases} \]

But since the second derivative is

\[ \frac{d^2\pi}{dQ^2} = -6Q + 118.5 \quad \begin{cases} > 0 & \text{when } Q = 3 \\ < 0 & \text{when } Q = 36.5 \end{cases} \]

the profit-maximizing output is Q* = 36.5 (tons per week). (The other output minimizes
profit.) By substituting Q* into the profit function, we can find the maximized profit to be
π* = π(36.5) = 16,318.44 (dollars per week).

As an alternative approach to the preceding, we can first find the MR and MC functions
and then equate the two, i.e., find their intersection. Since

\[ R'(Q) = 1{,}200 - 4Q \]

\[ C'(Q) = 3Q^2 - 122.5Q + 1{,}528.5 \]

equating the two functions will result in a quadratic equation identical with dπ/dQ = 0,
which has yielded the two critical values of Q cited previously.
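
The arithmetic of Example 3 can be reproduced with a short script. This is a sketch for checking the numbers only; the function names are assumptions of the sketch, while the coefficients are those of the example.

```python
import math


def profit(q):
    """Profit function of Example 3."""
    return -q ** 3 + 59.25 * q ** 2 - 328.5 * q - 2000.0


def profit_second_derivative(q):
    """Second derivative of profit: -6Q + 118.5."""
    return -6.0 * q + 118.5


def critical_values():
    """Roots of the first-order condition -3Q^2 + 118.5Q - 328.5 = 0."""
    a, b, c = -3.0, 118.5, -328.5
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b + root) / (2 * a), (-b - root) / (2 * a)))


for q in critical_values():
    kind = "maximum" if profit_second_derivative(q) < 0 else "minimum"
    print(q, kind, round(profit(q), 2))
```

The second-order check is what separates the two candidates: both satisfy MR = MC, but only the one with a negative second derivative maximizes profit.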


Coefficients of a Cubic Total-Cost Function 

In Example 3, a cubic function is used to represent the total-cost function. The traditional
total-cost curve C = C(Q), as illustrated in Fig. 9.7a, is supposed to contain two wiggles
that form a concave segment (decreasing marginal cost) and a subsequent convex segment
(increasing marginal cost). Since the graph of a cubic function always contains exactly two
wiggles, as illustrated in Fig. 9.4, it should suit that role well. However, Fig. 9.4
immediately alerts us to a problem: the cubic function can possibly produce a downward-sloping
segment in its graph, whereas the total-cost function, to make economic sense, should be
upward-sloping everywhere (a larger output always entails a higher total cost). If we wish
to use a cubic total-cost function such as

\[ C = C(Q) = aQ^3 + bQ^2 + cQ + d \tag{9.3} \]

therefore, it is essential to place appropriate restrictions on the parameters so as to prevent
the C curve from ever bending downward.

An equivalent way of stating this requirement is that the MC function should be positive
throughout, and this can be ensured only if the absolute minimum of the MC function turns
out to be positive. Differentiating (9.3) with respect to Q, we obtain the MC function

\[ MC = C'(Q) = 3aQ^2 + 2bQ + c \tag{9.4} \]
which, because it is a quadratic, plots as a parabola as in Fig. 9.7c. In order for the MC
curve to stay positive (above the horizontal axis) everywhere, it is necessary that the
parabola be U-shaped (otherwise, with an inverse U, the curve is bound to extend itself into
the second quadrant). Hence the coefficient of the Q^2 term in (9.4) has to be positive; i.e.,
we must impose the restriction a > 0. This restriction, however, is by no means sufficient,
because the minimum value of a U-shaped MC curve, call it MC_min (a relative minimum
which also happens to be an absolute minimum), may still occur below the horizontal
axis. Thus we must next find MC_min and ascertain the parameter restrictions that would
make it positive.

According to our knowledge of relative extremum, the minimum of MC will occur
where

\[ \frac{d}{dQ} MC = 6aQ + 2b = 0 \]

The output level that satisfies this first-order condition is

\[ Q^* = \frac{-2b}{6a} = \frac{-b}{3a} \]

This minimizes (rather than maximizes) MC because the second derivative d^2(MC)/dQ^2 =
6a is assuredly positive in view of the restriction a > 0. The knowledge of Q* now enables
us to calculate MC_min, but we may first infer the sign of coefficient b from it. Inasmuch as
negative output levels are ruled out, we see that b can never be positive (given a > 0).
Moreover, since the law of diminishing returns is assumed to set in at a positive output level
(that is, MC is assumed to have an initial declining segment), Q* should be positive (rather
than zero). Consequently, we must impose the restriction b < 0.

It is a simple matter now to substitute the MC-minimizing output Q* into (9.4) to find
that

\[ MC_{min} = 3a\left(\frac{-b}{3a}\right)^2 + 2b\left(\frac{-b}{3a}\right) + c = \frac{3ac - b^2}{3a} \]

Thus, to guarantee the positivity of MC_min, we must impose the restriction† b^2 < 3ac. This
last restriction, we may add, in effect also implies the restriction c > 0. (Why?)

The preceding discussion has involved the three parameters a, b, and c. What about the
other parameter, d? The answer is that there is need for a restriction on d also, but that has
nothing to do with the problem of keeping the MC positive. If we let Q = 0 in (9.3), we find


† This restriction may also be obtained by the method of completing the square. The MC function can
be successively transformed as follows:

\[ MC = 3aQ^2 + 2bQ + c = \left(3aQ^2 + 2bQ + \frac{b^2}{3a}\right) - \frac{b^2}{3a} + c = \left(\sqrt{3a}\,Q + \frac{b}{\sqrt{3a}}\right)^2 + \frac{-b^2 + 3ac}{3a} \]

Since the squared expression can possibly be zero, we must, in order to ensure the positivity of MC,
require that b^2 < 3ac on the knowledge that a > 0.
that C(0) = d. The role of d is thus to determine the vertical intercept of the C curve only,
with no bearing on its slope. Since the economic meaning of d is the fixed cost of a firm,
the appropriate restriction (in the short-run context) would be d > 0.

In sum, the coefficients of the total-cost function (9.3) should be restricted as follows
(assuming the short-run context):

\[ a, c, d > 0 \qquad b < 0 \qquad b^2 < 3ac \tag{9.5} \]

As you can readily verify, the C(Q) function in Example 3 does satisfy (9.5).
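
The verification can be sketched as follows. The helper names are assumptions of this sketch; the restrictions are those of (9.5), and MC_min is computed from the formula (3ac - b^2)/(3a) derived above.

```python
def satisfies_restrictions(a, b, c, d):
    """Check the coefficient restrictions (9.5) for C = aQ^3 + bQ^2 + cQ + d."""
    return a > 0 and c > 0 and d > 0 and b < 0 and b * b < 3 * a * c


def mc_minimizing_output(a, b):
    """Output level at which MC = 3aQ^2 + 2bQ + c attains its minimum."""
    return -b / (3 * a)


def mc_min(a, b, c):
    """Minimum value of marginal cost, (3ac - b^2) / (3a)."""
    return (3 * a * c - b * b) / (3 * a)


# Coefficients of the total-cost function in Example 3.
a, b, c, d = 1.0, -61.25, 1528.5, 2000.0

print(satisfies_restrictions(a, b, c, d))   # True
print(mc_minimizing_output(a, b))           # about 20.42
print(mc_min(a, b, c))                      # about 277.98
```

Because MC_min is positive here, the marginal-cost curve of Example 3 stays above the horizontal axis at every output level, so the total-cost curve never bends downward.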

Upward-Sloping Marginal-Revenue Curve 

The marginal-revenue curve in Fig. 9.7c is shown to be downward-sloping throughout.
This, of course, is how the MR curve is traditionally drawn for a firm under imperfect
competition. However, the possibility of the MR curve being partially, or even wholly,
upward-sloping can by no means be ruled out a priori.†

Given an average-revenue function AR = f(Q), the marginal-revenue function can be
expressed by

\[ MR = f(Q) + Q f'(Q) \qquad [\text{from (7.7)}] \]

The slope of the MR curve can thus be ascertained from the derivative

\[ \frac{d}{dQ} MR = f'(Q) + f'(Q) + Q f''(Q) = 2f'(Q) + Q f''(Q) \]

As long as the AR curve is downward-sloping (as it would be under imperfect competition),
the 2f'(Q) term is assuredly negative. But the Qf''(Q) term can be either negative, zero,
or positive, depending on the sign of the second derivative of the AR function, i.e.,
depending on whether the AR curve is strictly concave, linear, or strictly convex. If the AR curve
is strictly convex either in its entirety (as illustrated in Fig. 7.2) or along a specific segment,
the possibility will exist that the (positive) Qf''(Q) term may dominate the (negative)
2f'(Q) term, thereby causing the MR curve to be wholly or partially upward-sloping.


Example 4 


Let the average-revenue function be

\[ AR = f(Q) = 8{,}000 - 23Q + 1.1Q^2 - 0.018Q^3 \]

As can be verified (see Exercise 9.4-7), this function gives rise to a downward-sloping AR
curve, as is appropriate for a firm under imperfect competition. Since

\[ MR = f(Q) + Q f'(Q) = 8{,}000 - 46Q + 3.3Q^2 - 0.072Q^3 \]

it follows that the slope of MR is

\[ \frac{d}{dQ} MR = -46 + 6.6Q - 0.216Q^2 \]

Because this is a quadratic function and since the coefficient of Q^2 is negative, dMR/dQ must
plot as an inverse-U-shaped curve against Q, such as shown in Fig. 9.5a. If a segment of this
curve happens to lie above the horizontal axis, the slope of MR will take positive values.


† This point is emphatically brought out in John P. Formby, Stephen Layson, and W. James Smith,
“The Law of Demand, Positive Sloping Marginal Revenue, and Multiple Profit Equilibria,” Economic
Inquiry, April 1982, pp. 303-311.





Setting dMR/dQ = 0, and applying the quadratic formula, we find the two zeros of the
quadratic function to be Q_1 = 10.76 and Q_2 = 19.79 (approximately). This means that, for
values of Q in the open interval (Q_1, Q_2), the dMR/dQ curve does lie above the horizontal
axis. Thus the marginal-revenue curve indeed is positively sloped for output levels between
Q_1 and Q_2.
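
The two zeros can be recomputed directly from the quadratic formula. The sketch below uses only the coefficients of dMR/dQ given in Example 4; the function names are assumptions of the sketch.

```python
import math


def mr_slope(q):
    """Slope of the MR curve in Example 4: -46 + 6.6Q - 0.216Q^2."""
    return -46.0 + 6.6 * q - 0.216 * q ** 2


def mr_slope_zeros():
    """Zeros of dMR/dQ by the quadratic formula, in ascending order."""
    a, b, c = -0.216, 6.6, -46.0
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b + root) / (2 * a), (-b - root) / (2 * a)))


q1, q2 = mr_slope_zeros()
print(round(q1, 2), round(q2, 2))

# The slope of MR is positive strictly between the two zeros
# and negative outside them.
print(mr_slope(5.0) < 0, mr_slope(15.0) > 0, mr_slope(25.0) < 0)
```

The computed zeros come out at roughly 10.76 and 19.80, in line with the approximate values quoted in the example.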

The presence of a positively sloped segment on the MR curve has interesting
implications. Such an MR curve may produce more than one intersection with the MC curve
satisfying the second-order sufficient condition for profit maximization. While all such
intersections constitute local optima, however, only one of them is the global optimum that
the firm is seeking.


EXERCISE 9.4

1. Find the relative maxima and minima of y by the second-derivative test:

(a) y = -2x^2 + 8x + 25        (c) y = (1/3)x^3 - 3x^2 + 5x + 3

(b) y = x^3 + 6x^2 + 9         (d) y = 2x/(1 - 2x)   (x ≠ 1/2)

2. Mr. Greenthumb wishes to mark out a rectangular flower bed, using a wall of his house 
as one side of the rectangle. The other three sides are to be marked by wire netting, of 
which he has only 64 ft available. What are the length L and width W of the rectangle
that would give him the largest possible planting area? How do you make sure that 
your answer gives the largest, not the smallest area? 

3. A firm has the following total-cost and demand functions:

C = (1/3)Q^3 - 7Q^2 + 111Q + 50

Q = 100 - P

(a) Does the total-cost function satisfy the coefficient restrictions of (9.5)?

(b) Write out the total-revenue function R in terms of Q.

(c) Formulate the total-profit function π in terms of Q.

(d) Find the profit-maximizing level of output Q*. 

(e) What is the maximum profit? 

4. If coefficient b in (9.3) were to take a zero value, what would happen to the marginal- 
cost and total-cost curves? 

5. A quadratic profit function π(Q) = hQ^2 + jQ + k is to be used to reflect the following
assumptions:

(a) If nothing is produced, the profit will be negative (because of fixed costs).

(b) The profit function is strictly concave. 

(c) The maximum profit occurs at a positive output level Q*. 

What parameter restrictions are called for? 

6. A purely competitive firm has a single variable input L (labor), with the wage rate W_0
per period. Its fixed inputs cost the firm a total of F dollars per period. The price of the
product is P_0.

(a) Write the production function, revenue function, cost function, and profit function
of the firm.
(b) What is the first-order condition for profit maximization? Give this condition an
economic interpretation.

(c) What economic circumstances would ensure that profit is maximized rather than 
minimized? 

7. Use the following procedure to verify that the AR curve in Example 4 is negatively
sloped:

(a) Denote the slope of AR by S. Write an expression for S.

(b) Find the maximum value of S, S_max, by using the second-derivative test.

(c) Then deduce from the value of S_max that the AR curve is negatively sloped
throughout.


9.5 Maclaurin and Taylor Series

The time has now come for us to develop a test for relative extrema that can apply even
when the second derivative turns out to have a zero value at the stationary point. Before we
can do that, however, it is first necessary to discuss the so-called expansion of a function
y = f(x) into what are known, respectively, as a Maclaurin series (expansion around the
point x = 0) and a Taylor series (expansion around any point x = x_0).

To expand a function y = f(x) around a point means, in the present context, to
transform that function into a polynomial form, in which the coefficients of the various terms are
expressed in terms of the derivative values f'(x_0), f''(x_0), etc., all evaluated at the point
of expansion x_0. In the Maclaurin series, these will be evaluated at x = 0; thus we have
f'(0), f''(0), etc., in the coefficients. The result of expansion is a power series because,
being a polynomial, it consists of a sum of power functions.

Maclaurin Series of a Polynomial Function 

Let us consider first the expansion of a polynomial function of the nth degree,

\[ f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \cdots + a_n x^n \tag{9.6} \]

into an equivalent nth-degree polynomial where the coefficients (a_0, a_1, etc.) are expressed
instead in terms of the derivative values f'(0), f''(0), etc. Since this involves the
transformation of one polynomial into another of the same degree, it may seem a sterile and purposeless
exercise, but actually it will serve to shed much light on the whole idea of expansion.

Since the power series after expansion will involve the derivatives of various orders of
the function f, let us first find these. By successive differentiation of (9.6), we can get the
derivatives as follows:

\[
\begin{aligned}
f'(x) &= a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots + n a_n x^{n-1} \\
f''(x) &= 2a_2 + 3(2)a_3 x + 4(3)a_4 x^2 + \cdots + n(n-1)a_n x^{n-2} \\
f'''(x) &= 3(2)a_3 + 4(3)(2)a_4 x + \cdots + n(n-1)(n-2)a_n x^{n-3} \\
f^{(4)}(x) &= 4(3)(2)a_4 + 5(4)(3)(2)a_5 x + \cdots + n(n-1)(n-2)(n-3)a_n x^{n-4} \\
&\;\;\vdots \\
f^{(n)}(x) &= n(n-1)(n-2)(n-3)\cdots(3)(2)(1)a_n
\end{aligned}
\]
Note that each successive differentiation reduces the number of terms by one (the additive
constant in front drops out) until, in the nth derivative, we are left with a single product
term (a constant term). These derivatives can be evaluated at various values of x; here we
shall evaluate them at x = 0, with the result that all terms involving x will drop out. We are
then left with the following exceptionally neat derivative values:

\[ f'(0) = a_1 \quad f''(0) = 2a_2 \quad f'''(0) = 3(2)a_3 \quad f^{(4)}(0) = 4(3)(2)a_4 \quad \cdots \quad f^{(n)}(0) = n(n-1)(n-2)(n-3)\cdots(3)(2)(1)a_n \tag{9.7} \]

If we now adopt a shorthand symbol n! (read: “n factorial”), defined as

\[ n! \equiv n(n-1)(n-2)\cdots(3)(2)(1) \qquad (n = \text{a positive integer}) \]

so that, for example, 2! = 2 × 1 = 2 and 3! = 3 × 2 × 1 = 6, etc. (with 0! defined as
equal to 1), then the result in (9.7) can be rewritten as

\[ a_1 = \frac{f'(0)}{1!} \quad a_2 = \frac{f''(0)}{2!} \quad a_3 = \frac{f'''(0)}{3!} \quad a_4 = \frac{f^{(4)}(0)}{4!} \quad \cdots \quad a_n = \frac{f^{(n)}(0)}{n!} \]

Substituting these into (9.6) and utilizing the obvious fact that f(0) = a_0, we can now
express the given function f(x) as a new, but equivalent, same-degree polynomial in which
the coefficients are expressed in terms of derivatives evaluated at x = 0:*

\[ f(x) = \frac{f(0)}{0!} + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \cdots + \frac{f^{(n)}(0)}{n!}x^n \qquad [\text{Maclaurin's formula}] \tag{9.8} \]

This new polynomial, called the Maclaurin series of the polynomial function f(x),
represents the expansion of the function f(x) around zero (x = 0). Note that the point of
expansion (here, 0) is simply the value of x that will be used to evaluate f(x) and all its
derivatives.


Example 1 


Find the Maclaurin series for the function

\[ f(x) = 2 + 4x + 3x^2 \tag{9.9} \]

This function has the derivatives

\[ f'(x) = 4 + 6x \qquad f''(x) = 6 \]

so that

\[ f'(0) = 4 \qquad f''(0) = 6 \]

Thus the Maclaurin series is

\[ f(x) = f(0) + f'(0)x + \frac{f''(0)}{2}x^2 = 2 + 4x + 3x^2 \]

The previous line verifies that the Maclaurin series does indeed correctly represent the given
function.
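
The same bookkeeping can be sketched in code: differentiate repeatedly, evaluate each derivative at zero, and divide by the matching factorial, as in (9.8). The coefficient-list representation (ascending powers) and the function names are assumptions of this sketch.

```python
from math import factorial


def derivative(coeffs):
    """Derivative of a polynomial [a0, a1, a2, ...] given in ascending powers."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def maclaurin_coefficients(coeffs):
    """Return [f(0)/0!, f'(0)/1!, f''(0)/2!, ...] for the given polynomial."""
    result = []
    current = list(coeffs)
    k = 0
    while current:
        # The constant term of the kth derivative is its value at x = 0.
        result.append(current[0] / factorial(k))
        current = derivative(current)
        k += 1
    return result


# Example 1: f(x) = 2 + 4x + 3x^2
print(maclaurin_coefficients([2, 4, 3]))   # [2.0, 4.0, 3.0]
```

For a polynomial the procedure simply returns the original coefficients, which is the sense in which the expansion is an equivalent polynomial of the same degree.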


* Since 0! = 1 and 1! = 1, the first two terms on the right of the equals sign in (9.8) can be written
more simply as f(0) and f'(0)x, respectively. We have included the denominators 0! and 1! here to
call attention to the symmetry among the various terms in the expansion.
Taylor Series of a Polynomial Function

More generally, the polynomial function in (9.6) can be expanded around any point x_0, not
necessarily zero. In the interest of simplicity, we shall explain this by means of the specific
quadratic function in (9.9) and generalize the result later.

For the purpose of expansion around a specific point x_0, we may first interpret any given
value of x as a deviation from x_0. More specifically, we shall let x = x_0 + δ, where δ
represents the deviation from the value x_0. Upon such interpretation, the given function
(9.9) and its derivatives now become

\[
\begin{aligned}
f(x) &= 2 + 4(x_0 + \delta) + 3(x_0 + \delta)^2 \\
f'(x) &= 4 + 6(x_0 + \delta) \\
f''(x) &= 6
\end{aligned}
\tag{9.10}
\]

We know that the expression (x_0 + δ) = x is a variable in the function, but since x_0 in the
present context is a fixed (chosen) number, only δ can be properly regarded as a variable in
(9.10). Consequently, f(x) is in fact a function of δ, say, g(δ):

\[ g(\delta) = 2 + 4(x_0 + \delta) + 3(x_0 + \delta)^2 \qquad [\equiv f(x)] \]

with derivatives

\[ g'(\delta) = 4 + 6(x_0 + \delta) \qquad [\equiv f'(x)] \]

\[ g''(\delta) = 6 \qquad [\equiv f''(x)] \]

We already know how to expand g(δ) around zero (δ = 0). According to (9.8), such an
expansion will yield the following Maclaurin series:

\[ g(\delta) = \frac{g(0)}{0!} + \frac{g'(0)}{1!}\delta + \frac{g''(0)}{2!}\delta^2 \tag{9.11} \]

But since we have let x = x_0 + δ, the fact that δ = 0 implies x = x_0; hence, on the basis of
the identity g(δ) ≡ f(x), we can write for the case of δ = 0:

\[ g(0) = f(x_0) \qquad g'(0) = f'(x_0) \qquad g''(0) = f''(x_0) \]

Upon substituting these into (9.11), we find the result to represent the expansion of f(x)
around the point x_0, because the coefficients now involve the derivatives f'(x_0), f''(x_0),
etc., all evaluated at x = x_0:

\[ f(x)\,[= g(\delta)] = \frac{f(x_0)}{0!} + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 \tag{9.12} \]

You should compare this result, the Taylor polynomial of f(x), with the Maclaurin
polynomial of g(δ) in (9.11).

Since for the specific function under consideration, (9.9), we have

\[ f(x_0) = 2 + 4x_0 + 3x_0^2 \qquad f'(x_0) = 4 + 6x_0 \qquad f''(x_0) = 6 \]

the Taylor polynomial in (9.12) becomes

\[ f(x) = 2 + 4x_0 + 3x_0^2 + (4 + 6x_0)(x - x_0) + \frac{6}{2}(x - x_0)^2 = 2 + 4x + 3x^2 \]

This verifies that the Taylor polynomial does correctly represent the given function.



The expansion formula in (9.12) can be generalized to apply to the nth-degree
polynomial of (9.6). The generalized formula is

\[ f(x) = \frac{f(x_0)}{0!} + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n \qquad [\text{Taylor's formula}] \tag{9.13} \]

This differs from Maclaurin’s formula in (9.8) only in the replacement of zero by x_0 as the
point of expansion, and in the replacement of x by the expression (x - x_0). What (9.13)
tells us is that, given an nth-degree polynomial f(x), if we let x = 7 (say) in the terms on
the right of (9.13), select an arbitrary number x_0, then evaluate and add these terms, we will
end up exactly with f(7), the value of f(x) at x = 7.

Example 2

Taking x_0 = 3 as the point of expansion, we can rewrite (9.6) equivalently as

\[ f(x) = f(3) + f'(3)(x - 3) + \frac{f''(3)}{2!}(x - 3)^2 + \cdots + \frac{f^{(n)}(3)}{n!}(x - 3)^n \]

Expansion of an Arbitrary Function 

Heretofore, we have shown how an nth-degree polynomial function can be expressed in
another, equivalent, nth-degree polynomial form. As it turns out, it is also possible to
express any arbitrary function $\phi(x)$, one that is not necessarily a polynomial, in a
polynomial form similar to (9.13), provided $\phi(x)$ has finite, continuous derivatives up to the
desired order at the expansion point $x_0$.

According to a mathematical proposition known as Taylor's theorem, given an arbitrary
function $\phi(x)$, if we know the value of the function at $x = x_0$ [that is, $\phi(x_0)$] and the
values of its derivatives at $x_0$ [that is, $\phi'(x_0)$, $\phi''(x_0)$, etc.], then this function can be expanded
around the point $x_0$ as follows ($n$ = a fixed positive integer arbitrarily chosen):


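$$\phi(x) = \left[\frac{\phi(x_0)}{0!} + \frac{\phi'(x_0)}{1!}(x - x_0) + \frac{\phi''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{\phi^{(n)}(x_0)}{n!}(x - x_0)^n\right] + R_n$$

$$\equiv P_n + R_n \qquad \text{[Taylor's formula with remainder]} \tag{9.14}$$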


where $P_n$ represents the (bracketed) nth-degree polynomial [the first $(n + 1)$ terms on the
right], and $R_n$ denotes a remainder, to be explained below under the Lagrange form of the
remainder.* The presence of $R_n$ is what distinguishes (9.14) from Taylor's formula (9.13), and
for this reason (9.14) is called Taylor's formula with remainder. The form of the polynomial $P_n$
and the size of the remainder $R_n$ will depend on the value of $n$ we choose. The larger the $n$,
the more terms there will be in $P_n$; accordingly, $R_n$ will in general assume a different value
for each different $n$. This fact explains the need for the subscript $n$ in these two symbols. As
a memory aid, we can identify $n$ as the order of the highest derivative in $P_n$. (In the special
case of $n = 0$, no derivative will appear in $P_n$ at all.)


* The symbol $R_n$ (remainder) is not to be confused with the symbol $R^n$ ($n$-space).
The appearance of $R_n$ in (9.14) is due to the fact that we are here dealing with an arbitrary
function $\phi$ which cannot always be transformed exactly into, but can only be approximated by,
the polynomial form shown in (9.13). Therefore, a remainder term is included as
a supplement to the $P_n$ part, to represent the discrepancy between $\phi(x)$ and $P_n$. Thus, $P_n$
constitutes a polynomial approximation to $\phi(x)$, with the term $R_n$ as a measure of the error
of approximation. If we choose $n = 1$, for example, we have

$$\phi(x) = [\phi(x_0) + \phi'(x_0)(x - x_0)] + R_1 = P_1 + R_1$$

where $P_1$ consists of $n + 1 = 2$ terms and constitutes a linear approximation to $\phi(x)$. If we
choose $n = 2$, a second-power term will appear, so that

$$\phi(x) = \left[\phi(x_0) + \phi'(x_0)(x - x_0) + \frac{\phi''(x_0)}{2!}(x - x_0)^2\right] + R_2 = P_2 + R_2$$

where $P_2$, consisting of $n + 1 = 3$ terms, is a quadratic approximation to $\phi(x)$. And so
forth. The fact that we can create polynomial approximations to any arbitrary function
(provided it has finite, continuous derivatives) is of great practical significance. Polynomial
functions, even higher-degree ones, are relatively easy to work with, and if they can
serve as good approximations to some difficult functions, much convenience is to be
gained, as the next two examples will illustrate.

We should point out that the arbitrary function could obviously encompass the nth-degree
polynomial of (9.6) as a special case. For this latter case, if the expansion is into
another nth-degree polynomial, the result of (9.13) will exactly apply; or in other words, we
can use the result in (9.14), with $R_n = 0$. However, if the given nth-degree polynomial $f(x)$
is to be expanded into a polynomial of a lesser degree, then the latter can only be considered
an approximation to $f(x)$, and a remainder must appear; in that case, the result in
(9.14) can be applied with a nonzero remainder. Thus Taylor's formula in the form of (9.14)
is perfectly general.

Example 3 Expand the nonpolynomial function

$$\phi(x) = \frac{1}{1 + x}$$

around the point $x_0 = 1$, with $n = 4$. We shall need the first four derivatives of $\phi(x)$, which
are

$$\phi'(x) = -(1 + x)^{-2} \qquad \text{so that} \qquad \phi'(1) = -(2)^{-2} = -\frac{1}{4}$$

$$\phi''(x) = 2(1 + x)^{-3} \qquad\qquad \phi''(1) = 2(2)^{-3} = \frac{1}{4}$$

$$\phi'''(x) = -6(1 + x)^{-4} \qquad\qquad \phi'''(1) = -6(2)^{-4} = -\frac{3}{8}$$

$$\phi^{(4)}(x) = 24(1 + x)^{-5} \qquad\qquad \phi^{(4)}(1) = 24(2)^{-5} = \frac{3}{4}$$

Also, we see that $\phi(1) = \frac{1}{2}$. Thus, setting $x_0 = 1$ in (9.14) and utilizing the obtained
derivatives, we arrive at the following Taylor series with remainder:

$$\phi(x) = \frac{1}{2} - \frac{1}{4}(x - 1) + \frac{1}{8}(x - 1)^2 - \frac{1}{16}(x - 1)^3 + \frac{1}{32}(x - 1)^4 + R_4$$

$$= \frac{31}{32} - \frac{13}{16}x + \frac{1}{2}x^2 - \frac{3}{16}x^3 + \frac{1}{32}x^4 + R_4$$
It is possible, of course, to choose $x_0 = 0$ as the point of expansion here, too. In that
case, with $x_0$ set equal to zero in (9.14), the expansion will result in a Maclaurin series with
remainder.
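
As a cross-check on Example 3, the sketch below (an illustration only, with hypothetical helper names) rebuilds the fourth-degree Taylor polynomial of $1/(1 + x)$ around $x_0 = 1$ directly from the derivatives. It assumes the general pattern $\phi^{(k)}(x) = (-1)^k k!\,(1 + x)^{-(k+1)}$, which agrees with the four derivatives listed in the example, and it uses exact fractions so that no rounding enters.

```python
from fractions import Fraction
from math import factorial


def phi(x):
    """The function of Example 3: phi(x) = 1/(1 + x)."""
    return Fraction(1) / (1 + x)


def phi_derivative(k, x):
    """k-th derivative of 1/(1 + x), assuming the pattern (-1)^k k! (1 + x)^-(k+1)."""
    return Fraction((-1) ** k * factorial(k)) / (1 + x) ** (k + 1)


def taylor_polynomial(x, x0, n):
    """P_n of (9.14): the sum of phi^(k)(x0)/k! * (x - x0)^k for k = 0, ..., n."""
    return sum(
        phi_derivative(k, x0) / factorial(k) * (x - x0) ** k
        for k in range(n + 1)
    )


def collected_form(x):
    """The polynomial of Example 3 after collecting powers of x."""
    return (
        Fraction(31, 32)
        - Fraction(13, 16) * x
        + Fraction(1, 2) * x**2
        - Fraction(3, 16) * x**3
        + Fraction(1, 32) * x**4
    )


if __name__ == "__main__":
    for x in (0, 1, 2, Fraction(3, 2)):
        p4 = taylor_polynomial(x, 1, 4)
        assert p4 == collected_form(x)   # the two forms of P_4 agree
        print(x, p4, phi(x) - p4)        # last column is the remainder R_4
```

At the expansion point $x = 1$ the remainder is zero, while at other points the last column shows the (nonzero) error of approximation $R_4$.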

Example 4 Expand the quadratic function

$$\phi(x) = 5 + 2x + x^2$$

around $x_0 = 1$, with $n = 1$. This function is, like (9.9) in Example 1, a second-degree
polynomial. But since $n = 1$, our assigned task is to expand it into a first-degree polynomial,
i.e., to find a linear approximation to the given quadratic function; thus a remainder term
is bound to appear. For this reason, $\phi(x)$ should be viewed as an "arbitrary" function for the
purpose of this Taylor expansion.

To carry out this expansion, we need only the first derivative $\phi'(x) = 2 + 2x$. Evaluated
at $x_0 = 1$, the given function and its derivative yield

$$\phi(x_0) = \phi(1) = 8 \qquad \phi'(x_0) = \phi'(1) = 4$$

Thus Taylor's formula with remainder gives us

$$\phi(x) = \phi(x_0) + \phi'(x_0)(x - x_0) + R_1 = 8 + 4(x - 1) + R_1 = 4 + 4x + R_1$$

where the $(4 + 4x)$ term is a linear approximation and the $R_1$ term represents the error of
approximation.

In Fig. 9.8, $\phi(x)$ plots as a parabola, and its linear approximation as a straight line tangent
to the $\phi(x)$ curve at the point (1, 8). The occurrence of the point of tangency at $x = 1$
is not a matter of coincidence; rather, it is the direct consequence of the fact that the point
of expansion is set at that particular value of $x$. This suggests that, when an arbitrary function
$\phi(x)$ is approximated by a polynomial, the latter will give the exact value of $\phi(x)$ at
(and only at) the point of expansion, with zero error of approximation ($R_1 = 0$). Elsewhere,
$R_1$ is strictly nonzero and, in fact, shows increasingly larger errors of approximation as we
try to approximate $\phi(x)$ for $x$ values farther and farther away from the point of expansion
$x_0$. Thus, when attempting to approximate any function by a polynomial, if we are
most interested in obtaining an accurate approximation in the neighborhood of a specific
value of $x$, say $x_0$, then we ought to choose $x_0$ as the point of expansion.
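
The behavior of the error in Example 4 can be tabulated with a few lines of Python. This is only an illustrative sketch (the function names are not from the text): it computes $R_1$ as the difference between the quadratic and its linear approximation $4 + 4x$. Algebraically that difference is $(x - 1)^2$, so it vanishes at $x = 1$ and grows with the distance from the point of expansion.

```python
def phi(x):
    """The quadratic of Example 4: phi(x) = 5 + 2x + x^2."""
    return 5 + 2 * x + x**2


def linear_approximation(x):
    """P_1 from Example 4, expanded around x0 = 1: 4 + 4x."""
    return 4 + 4 * x


def remainder(x):
    """R_1 = phi(x) - P_1, the error of the linear approximation."""
    return phi(x) - linear_approximation(x)


if __name__ == "__main__":
    # The error is zero at x = 1 and increases on either side of it.
    for x in (-1, 0, 1, 2, 3):
        print(x, phi(x), linear_approximation(x), remainder(x))
```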

The construction of Fig. 9.8 is strongly reminiscent of Fig. 8.1. Indeed, both figures are
concerned with "approximations." But there is a difference in the scope of approximation.
In Fig. 8.1, we attempt to approximate $\Delta y$ by the differential $dy$ with the help of a tangent
line drawn at $x_0$, a given starting value of $x$. In Fig. 9.8, on the other hand, we aim more
broadly to approximate an entire curve by a particular straight line, i.e., to approximate the
height of the curve at any value of $x$, say $x_1$, by the corresponding height of the straight line
at $x_1$. Note that, in both cases, the error of approximation varies with the value of $x$. In
Fig. 8.1, the error (the difference between $dy$ and $\Delta y$) gets smaller as $\Delta x$ gets smaller, or as
$x$ gets closer to $x_0$, at which the tangent line is drawn. In Fig. 9.8, the error (the vertical
discrepancy between the straight line and the curve) gets smaller as $x$ approaches $x_0$, the
chosen point of expansion.


Lagrange Form of the Remainder 

Now we must comment further on the remainder term. According to the Lagrange form of
the remainder, we can express $R_n$ as

$$R_n = \frac{\phi^{(n+1)}(p)}{(n + 1)!}(x - x_0)^{n+1} \tag{9.15}$$

where $p$ is some number between $x$ (the point where we wish to evaluate the arbitrary function
$\phi$) and $x_0$ (the point where we expand the function $\phi$). Note that this expression closely
resembles the term which should logically follow the last term in $P_n$ in (9.14), except that
the derivative involved is here to be evaluated at a point $p$ instead of $x_0$. Since the point $p$
is, unfortunately, not otherwise specified, this formula does not really enable us to calculate
$R_n$; nevertheless, it does have great analytical significance. Let us therefore illustrate its
meaning graphically, although we shall do it only for the simple case of $n = 0$.

When $n = 0$, no derivatives whatever will appear in the polynomial part $P_0$; therefore
(9.14) reduces to

$$\phi(x) = P_0 + R_0 = \phi(x_0) + \phi'(p)(x - x_0)$$

or

$$\phi(x) - \phi(x_0) = \phi'(p)(x - x_0)$$

This result, a simple version of the mean-value theorem, states that the difference between
the value of the function $\phi$ at $x_0$ and at any other $x$ value can be expressed as the product
of the difference $(x - x_0)$ and the derivative $\phi'$ evaluated at $p$ (with $p$ being some point
between $x$ and $x_0$). Let us look at Fig. 9.9, where the function $\phi(x)$ is shown as a continuous
curve with derivative values defined at all points. Let $x_0$ be the chosen point of expansion,
and let $x$ be any point on the horizontal axis. If we try to approximate $\phi(x)$, or distance
$xB$, by $\phi(x_0)$, or distance $x_0A$, it will involve an error equal to $\phi(x) - \phi(x_0)$, or the
distance $CB$. What the mean-value theorem says is that the error $CB$, which constitutes
the value of the remainder term $R_0$ in the expansion, can be expressed as $\phi'(p)(x - x_0)$,
where $p$ is some point between $x$ and $x_0$. First we locate, on the curve between points
$A$ and $B$, a point $D$ such that the tangent line at $D$ is parallel to line $AB$; such a point $D$ must
exist, since the curve passes from $A$ to $B$ in a continuous and smooth manner. Then, the
remainder will be

$$R_0 = CB = \frac{CB}{AC} \cdot AC = (\text{slope of } AB) \cdot AC$$

$$= (\text{slope of tangent at } D) \cdot AC$$

$$= (\text{slope of curve at } x = p) \cdot AC$$

$$= \phi'(p)(x - x_0)$$

where the point $p$ is between $x$ and $x_0$, as required. This demonstrates the rationale of the
Lagrange form of the remainder for the case $n = 0$. We can always express $R_0$ as
$\phi'(p)(x - x_0)$ because, even though $p$ cannot be assigned a specific value, we can be sure
that such a point exists.
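
Although the theorem only guarantees that such a point $p$ exists, for a concrete function it can be located numerically. The sketch below is an illustration under stated assumptions, not a method from the text: it takes the hypothetical function $\phi(x) = x^3$ with $x_0 = 1$ and $x = 2$, and searches by bisection for the $p$ at which the slope of the curve equals the slope of the chord $AB$. The bisection assumes that $\phi'(p)$ minus the chord slope changes sign exactly once between $x_0$ and $x$, which holds for this example.

```python
def phi(x):
    """Hypothetical example function: phi(x) = x^3."""
    return x**3


def phi_prime(x):
    """Its derivative: phi'(x) = 3x^2."""
    return 3 * x**2


def find_p(func, func_prime, x0, x, tol=1e-12):
    """Find p between x0 and x with func'(p) equal to the slope of the chord.

    Assumes func'(p) - slope has opposite signs at the two endpoints
    (an assumption of this sketch, not a general guarantee).
    """
    slope = (func(x) - func(x0)) / (x - x0)   # slope of line AB
    lo, hi = min(x0, x), max(x0, x)

    def gap(p):
        return func_prime(p) - slope

    if gap(lo) * gap(hi) > 0:
        raise ValueError("no sign change between x0 and x; bisection not applicable")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    p = find_p(phi, phi_prime, 1.0, 2.0)
    # R_0 = phi(x) - phi(x0) should equal phi'(p) * (x - x0).
    print(p, phi(2.0) - phi(1.0), phi_prime(p) * (2.0 - 1.0))
```

For this function the condition $3p^2 = 7$ can also be solved by hand, giving $p = \sqrt{7/3}$, which lies between 1 and 2 as the theorem requires.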

Equation (9.15) provides a way of expressing the remainder term $R_n$, but it does not
eliminate $R_n$ as a source of discrepancy between $\phi(x)$ and the polynomial $P_n$. However, if
it happens that as we increase $n$ (thus raising the degree of the polynomial) indefinitely, we
find that

$$R_n \to 0 \text{ as } n \to \infty \qquad \text{so that} \qquad P_n \to \phi(x) \text{ as } n \to \infty$$

then the Taylor series is said to be convergent to $\phi(x)$ at the point of expansion, and the
Taylor series can be written as a convergent infinite series as follows:

$$\phi(x) = \frac{\phi(x_0)}{0!} + \frac{\phi'(x_0)}{1!}(x - x_0) + \frac{\phi''(x_0)}{2!}(x - x_0)^2 + \cdots \tag{9.16}$$

Note that the $R_n$ term is no longer shown; in its place is an ellipsis signifying that the
polynomial contains an infinite number of subsequent terms whose mathematical structures
follow the pattern indicated by the previous terms. In this (convenient) event, it will be
possible to make $P_n$ as accurate an approximation to $\phi(x)$ as we desire by choosing a large
enough value for $n$, that is, by including a large enough number of terms in the polynomial
$P_n$. An important example of this will be discussed in Sec. 10.2.
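
To see what $R_n \to 0$ looks like in practice, the following sketch (illustrative only) reuses the function $\phi(x) = 1/(1 + x)$ of Example 3 with $x_0 = 1$. It assumes the Taylor coefficients $\phi^{(k)}(1)/k! = (-1)^k / 2^{k+1}$, which follow from the derivative values computed in that example, and it evaluates the remainder exactly for increasing $n$.

```python
from fractions import Fraction


def taylor_partial_sum(x, n):
    """P_n for phi(x) = 1/(1 + x) around x0 = 1.

    Uses the coefficients phi^(k)(1)/k! = (-1)^k / 2^(k+1).
    """
    return sum(
        Fraction((-1) ** k, 2 ** (k + 1)) * (x - 1) ** k
        for k in range(n + 1)
    )


def remainder(x, n):
    """R_n = phi(x) - P_n, computed exactly."""
    return Fraction(1) / (1 + x) - taylor_partial_sum(x, n)


if __name__ == "__main__":
    # At x = 2 the remainder shrinks steadily as n grows.
    for n in (0, 1, 2, 4, 8, 16):
        print(n, remainder(2, n))
    # At x = 4 the same expansion does not settle down: the remainder grows.
    for n in (0, 1, 2, 4, 8, 16):
        print(n, remainder(4, n))
```

The second loop is a reminder that convergence is a property to be checked rather than assumed: for this function and expansion point, the remainder shrinks at $x = 2$ but grows without bound at $x = 4$.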
EXERCISE 9.5

1. Find the value of the following factorial expressions:
(a) $5!$ (b) $8!$ (c) [illegible in source] (d) [illegible in source] (e) $\dfrac{(n + 2)!}{n!}$

2. Find the first five terms of the Maclaurin series (i.e., choose $n = 4$ and let $x_0 = 0$) for:

(a) $\phi(x)$ = [illegible in source] (b) $\phi(x)$ = [illegible in source]

3. Find the Taylor series with $n = 4$ and $x_0 = -2$, for the two functions in Prob. 2.

4. On the basis of Taylor's formula with the Lagrange form of the remainder [see (9.14)
and (9.15)], show that at the point of expansion ($x = x_0$) the Taylor series will always
give exactly the value of the function at that point, $\phi(x_0)$, not merely an approximation.


9.6 Nth-Derivative Test for Relative Extremum of a Function of One Variable

The expansion of a function into a Taylor (or Maclaurin) series is useful as an approximation
device in the circumstance that $R_n \to 0$ as $n \to \infty$, but our present concern is with its
application in the development of a general test for a relative extremum.


Taylor Expansion and Relative Extremum 

As a preparatory step for that task, let us redefine a relative extremum as follows:

A function $f(x)$ attains a relative maximum (minimum) value at $x_0$ if $f(x) - f(x_0)$ is
negative (positive) for values of $x$ in the immediate neighborhood of $x_0$, both to its left and
to its right.

This can be made clear by reference to Fig. 9.10, where $x_1$ is a value of $x$ to the left of $x_0$,
and $x_2$ is a value of $x$ to the right of $x_0$. In Fig. 9.10a, $f(x_0)$ is a relative maximum; thus
$f(x_0)$ exceeds both $f(x_1)$ and $f(x_2)$. In short, $f(x) - f(x_0)$ is negative for any value of $x$
in the immediate neighborhood of $x_0$. The opposite is true of Fig. 9.10b, where $f(x_0)$ is a
relative minimum, and thus $f(x) - f(x_0) > 0$.

Assuming $f(x)$ to have finite, continuous derivatives up to the desired order at the point
$x = x_0$, the function $f(x)$, not necessarily polynomial, can be expanded around the
point $x_0$ as a Taylor series. On the basis of (9.14) (after duly changing $\phi$ to $f$), and using the
Lagrange form of the remainder, we can write

$$f(x) - f(x_0) = f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \frac{f^{(n+1)}(p)}{(n + 1)!}(x - x_0)^{n+1} \tag{9.17}$$
FIGURE 9.10

If the sign of the expression $f(x) - f(x_0)$ can be determined for values of $x$ to the immediate
left and right of $x_0$, we can readily come to a conclusion as to whether $f(x_0)$ is an
extremum, and if so, whether it is a maximum or a minimum. For this, it is necessary to
examine the right-hand sum of (9.17). Altogether, there are $(n + 1)$ terms in this sum ($n$
terms from $P_n$, plus the remainder which is in the $(n + 1)$st degree), and thus the actual
number of terms is indefinite, being dependent upon the value of $n$ we choose. However, by
properly choosing $n$, we can always make sure that there will exist only a single term on the
right. This will drastically simplify the task of evaluating the sign of $f(x) - f(x_0)$ and
ascertaining whether $f(x_0)$ is an extremum, and if so, which kind.

Some Specific Cases 

This can be made clearer through some specific illustrations.

Case 1 $f'(x_0) \neq 0$

If the first derivative at $x_0$ is nonzero, let us choose $n = 0$, so that the remainder will be
in the first degree. Then there will be only $n + 1 = 1$ term on the right side, implying that
only the remainder $R_0$ will be there. That is, we have

$$f(x) - f(x_0) = \frac{f'(p)}{1!}(x - x_0) = f'(p)(x - x_0)$$

where $p$ is some number between $x_0$ and a value of $x$ in the immediate neighborhood of $x_0$.
Note that $p$ must accordingly be very, very close to $x_0$.

What is the sign of the expression on the right? Because of the continuity of the derivative,
$f'(p)$ will have the same sign as $f'(x_0)$ since, as mentioned before, $p$ is very, very
close to $x_0$. In the present case, $f'(p)$ must be nonzero; in fact, it must be a specific positive
or negative number. But what about the $(x - x_0)$ part? When we go from the left of $x_0$ to its
right, $x$ shifts from a magnitude $x_1 < x_0$ to a magnitude $x_2 > x_0$ (see Fig. 9.10). Consequently,
the expression $(x - x_0)$ must turn from negative to positive as we move, and
$f(x) - f(x_0) = f'(p)(x - x_0)$ must also change sign from the left of $x_0$ to its right. However,
this violates our new definition of a relative extremum; accordingly, there cannot exist
a relative extremum at $f(x_0)$ when $f'(x_0) \neq 0$, a fact that is already well known to us.


Case 2 $f'(x_0) = 0$; $f''(x_0) \neq 0$

In this case, choose $n = 1$, so that the remainder will be in the second degree. Then
initially there will be $n + 1 = 2$ terms on the right. But one of these terms will vanish
because $f'(x_0) = 0$, and we shall again be left with only one term to evaluate:

$$f(x) - f(x_0) = f'(x_0)(x - x_0) + \frac{f''(p)}{2!}(x - x_0)^2$$

$$= \frac{1}{2} f''(p)(x - x_0)^2 \qquad [\text{because } f'(x_0) = 0]$$

As before, $f''(p)$ will have the same sign as $f''(x_0)$, a sign that is specified and unvarying,
whereas the $(x - x_0)^2$ part, being a square, is invariably positive. Thus the expression
$f(x) - f(x_0)$ must take the same sign as $f''(x_0)$ and, according to the earlier definition of
relative extremum, will specify

A relative maximum of $f(x)$ if $f''(x_0) < 0$
A relative minimum of $f(x)$ if $f''(x_0) > 0$
[with $f'(x_0) = 0$]

You will recognize this as the second-derivative test introduced earlier.

Case 3 $f'(x_0) = f''(x_0) = 0$, but $f'''(x_0) \neq 0$

Here we are encountering a situation that the second-derivative test is incapable of handling,
for $f''(x_0)$ is now zero. With the help of the Taylor series, however, a conclusive
result can be established without difficulty.

Let us choose $n = 2$; then three terms will initially appear on the right. But two of
these will drop out because $f'(x_0) = f''(x_0) = 0$, so that we again have only one term to
evaluate:

$$f(x) - f(x_0) = f'(x_0)(x - x_0) + \frac{1}{2!} f''(x_0)(x - x_0)^2 + \frac{1}{3!} f'''(p)(x - x_0)^3$$

$$= \frac{1}{6} f'''(p)(x - x_0)^3 \qquad [\text{because } f'(x_0) = 0,\ f''(x_0) = 0]$$

As previously, the sign of $f'''(p)$ is identical with that of $f'''(x_0)$ because of the continuity
of the derivative and because $p$ is very close to $x_0$. But the $(x - x_0)^3$ part has a varying sign.
Specifically, since $(x - x_0)$ is negative to the left of $x_0$, so also will be $(x - x_0)^3$; yet, to the
right of $x_0$, the $(x - x_0)^3$ part will be positive. Thus there is a change in the sign of
$f(x) - f(x_0)$ as we pass through $x_0$, which violates the definition of a relative extremum.
However, we know that $x_0$ is a critical value [$f'(x_0) = 0$], and thus it must give an inflection
point, inasmuch as it does not give a relative extremum.


Case 4 $f'(x_0) = f''(x_0) = \cdots = f^{(N-1)}(x_0) = 0$, but $f^{(N)}(x_0) \neq 0$

This is a very general case, and we can therefore derive a general result from it. Note
that here all the derivative values are zero until we arrive at the Nth one.

Analogously to the preceding three cases, the Taylor series for Case 4 will reduce to

$$f(x) - f(x_0) = \frac{1}{N!} f^{(N)}(p)(x - x_0)^N$$

Again, $f^{(N)}(p)$ takes the same sign as $f^{(N)}(x_0)$, which is unvarying. The sign of the
$(x - x_0)^N$ part, on the other hand, will vary if $N$ is odd (cf. Cases 1 and 3) and will remain
unchanged (positive) if $N$ is even (cf. Case 2). When $N$ is odd, accordingly, $f(x) - f(x_0)$
will change sign as we pass through the point $x_0$, thereby violating the definition of a
relative extremum (which means that $x_0$ must give us an inflection point on the curve). But
when $N$ is even, $f(x) - f(x_0)$ will not change sign from the left of $x_0$ to its right, and this
will establish the stationary value $f(x_0)$ as a relative maximum or minimum, depending on
whether $f^{(N)}(x_0)$ is negative or positive.

Nth-Derivative Test 

At last, then, we may state the following general test. 

Nth-Derivative test for relative extremum of a function of one variable If the first
derivative of a function $f(x)$ at $x_0$ is $f'(x_0) = 0$ and if the first nonzero derivative value at
$x_0$ encountered in successive derivation is that of the Nth derivative, $f^{(N)}(x_0) \neq 0$, then the
stationary value $f(x_0)$ will be

a. A relative maximum if $N$ is an even number and $f^{(N)}(x_0) < 0$.
b. A relative minimum if $N$ is an even number but $f^{(N)}(x_0) > 0$.
c. An inflection point if $N$ is odd.

It should be clear from the preceding statement that the Nth-derivative test can work if
and only if the function $f(x)$ is capable of yielding, sooner or later, a nonzero derivative
value at the critical value $x_0$. While there do exist exceptional functions that fail to satisfy
this condition, most of the functions we are likely to encounter will indeed produce nonzero
$f^{(N)}(x_0)$ in successive differentiation.† Thus the test should prove serviceable in most
instances.


† If $f(x)$ is a constant function, for instance, then obviously $f'(x) = f''(x) = \cdots = 0$, so that no
nonzero derivative value can ever be found. This, however, is a trivial case, since a constant function
requires no test for extremum anyway. As a nontrivial example, consider the function

$$y = \begin{cases} e^{-1/x^2} & (\text{for } x \neq 0) \\ 0 & (\text{for } x = 0) \end{cases}$$

where the function $y = e^{-1/x^2}$ is an exponential function, yet to be introduced (Chap. 10). By
itself, $y = e^{-1/x^2}$ is discontinuous at $x = 0$, because $x = 0$ is not in the domain (division by zero is
undefined). However, since $\lim_{x \to 0} y = 0$, we can, by appending the stipulation that $y = 0$ for $x = 0$, fill
the gap in the domain and thereby obtain a continuous function. The graph of this function shows
that it attains a minimum at $x = 0$. But it turns out that, at $x = 0$, all the derivatives (up to any order)
have zero values. Thus we are unable to apply the Nth-derivative test to confirm the graphically
ascertainable fact that the function has a minimum at $x = 0$. For further discussion of this exceptional
case, see R. Courant, Differential and Integral Calculus (translated by E. J. McShane), Interscience,
New York, vol. I, 2d ed., 1937, pp. 196, 197, and 336.
Example 1 Examine the function $y = (7 - x)^4$ for its relative extremum. Since $f'(x) = -4(7 - x)^3$ is
zero when $x = 7$, we take $x = 7$ as the critical value for testing, with $y = 0$ as the stationary
value of the function. By successive derivation (continued until we encounter a nonzero
derivative value at the point $x = 7$), we get

$$f''(x) = 12(7 - x)^2 \qquad \text{so that} \qquad f''(7) = 0$$

$$f'''(x) = -24(7 - x) \qquad\qquad f'''(7) = 0$$

$$f^{(4)}(x) = 24 \qquad\qquad f^{(4)}(7) = 24$$

Since 4 is an even number and since $f^{(4)}(7)$ is positive, we conclude that the point (7, 0)
represents a relative minimum.

As is easily verified, this function plots as a strictly convex curve. Inasmuch as the second
derivative at $x = 7$ is zero (rather than positive), this example serves to illustrate our earlier
statement regarding the second derivative and the curvature of a curve (Sec. 9.3) to the
effect that, while a positive $f''(x)$ for all $x$ does imply a strictly convex $f(x)$, a strictly convex
$f(x)$ does not imply a positive $f''(x)$ for all $x$. More importantly, it also serves to illustrate the
fact that, given a strictly convex (strictly concave) curve, the extremum found on that curve
must be a minimum (maximum), because such an extremum will either satisfy the second-order
sufficient condition, or, failing that, satisfy another (higher-order) sufficient condition
for a minimum (maximum).
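
For polynomial functions, the Nth-derivative test can be automated by differentiating a list of coefficients repeatedly. The sketch below is one possible arrangement, offered as an illustration rather than a definitive implementation; the representation (ascending coefficients $[c_0, c_1, c_2, \ldots]$) and the function names are assumptions of this example. It is applied to $y = (7 - x)^4$, whose expanded form is $2401 - 1372x + 294x^2 - 28x^3 + x^4$.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial given ascending coefficients [c0, c1, c2, ...]."""
    return sum(c * x**k for k, c in enumerate(coeffs))


def differentiate(coeffs):
    """Return the ascending coefficients of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def nth_derivative_test(coeffs, x0):
    """Apply the Nth-derivative test to a polynomial at the critical value x0.

    Returns (N, verdict), where N is the order of the first nonzero
    derivative at x0. Raises ValueError if x0 is not a critical value or
    if every derivative vanishes there (e.g., a constant function).
    """
    derivative = differentiate(coeffs)
    if evaluate(derivative, x0) != 0:
        raise ValueError("x0 is not a critical value: f'(x0) is nonzero")
    order = 1
    while derivative:
        value = evaluate(derivative, x0)
        if value != 0:
            if order % 2 == 1:
                verdict = "inflection point"
            elif value > 0:
                verdict = "relative minimum"
            else:
                verdict = "relative maximum"
            return order, verdict
        derivative = differentiate(derivative)
        order += 1
    raise ValueError("all derivatives vanish at x0; the test is inconclusive")


if __name__ == "__main__":
    # y = (7 - x)^4 at x0 = 7, as in Example 1.
    print(nth_derivative_test([2401, -1372, 294, -28, 1], 7))
    # Hypothetical further cases: y = x^5 and y = -x^2 at x0 = 0.
    print(nth_derivative_test([0, 0, 0, 0, 0, 1], 0))
    print(nth_derivative_test([0, 0, -1], 0))
```

Using integer coefficients keeps the zero comparisons exact; with floating-point inputs a tolerance would be needed in place of the `!= 0` checks.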


EXERCISE 9.6

1. Find the stationary values of the following functions:

(a) $y = x^3$ (b) $y = -x^4$ (c) $y = x^6 + 5$

Determine by the Nth-derivative test whether they represent relative maxima, relative
minima, or inflection points.

2. Find the stationary values of the following functions:

(a) $y = (x - 1)^3 + 16$ (b) $y = (x - 2)^4$
(c) $y = (3 - x)^{[\text{exponent illegible in source}]} + 7$ (d) $y = (5 - 2x)^4 + 8$

Use the Nth-derivative test to determine the exact nature of these stationary values.
Chapter 10 Exponential and Logarithmic Functions

The Nth-derivative test developed in Chap. 9 equips us for the task of locating the extreme
values of any objective function, as long as it involves only one choice variable, possesses
derivatives to the desired order, and sooner or later yields a nonzero derivative value at the
critical value $x_0$. In the examples cited in Chap. 9, however, we made use only of polynomial
and rational functions, for which we know how to obtain the necessary derivatives.
Suppose that our objective function happened to be an exponential one.
Then we are still helpless in applying the derivative criterion, because we have yet to learn
how to differentiate such a function. This is what we shall do in the present chapter.

Exponential functions, as well as the closely related logarithmic functions, have important
applications in economics, especially in connection with growth problems, and in economic
dynamics in general. The particular application relevant to the present part of the
book, however, involves a class of optimization problems in which the choice variable is
time. For example, a certain wine dealer may have a stock of wine, the market value of
which is known to increase with time in some prescribed fashion. The problem is to determine
the best time to sell that stock on the basis of the wine-value function, after taking into
account the interest cost involved in having the money capital tied up in that stock.
Exponential functions may enter into such a problem in two ways. First, the value of the wine
may increase with time according to some exponential law of growth. In that event, we
would have an exponential wine-value function. Second, when we consider the interest
cost, the presence of interest compounding will surely introduce an exponential function
into the picture. Thus we must study the nature of exponential functions before we can
discuss this type of optimization problem.

Since our primary purpose is to deal with time as a choice variable, let us now switch to
the symbol $t$, in lieu of $x$, to indicate the independent variable in the subsequent discussion.
(However, this same symbol $t$ can very well represent variables other than time also.)
10.1 The Nature of Exponential Functions

As introduced in connection with polynomial functions, the term exponent means an indicator
of the power to which a variable is to be raised. In power expressions such as $x^3$ or $x^5$,
the exponents are constants; but there is no reason why we cannot also have a variable
exponent, such as in $3^x$ or $3^t$, where the number 3 is to be raised to varying powers (various
values of $x$ or $t$). A function whose independent variable appears in the role of an exponent
is called an exponential function.

Simple Exponential Function 

In its simple version, the exponential function may be represented in the form

$$y = f(t) = b^t \qquad (b > 1) \tag{10.1}$$

where $y$ and $t$ are the dependent and independent variables, respectively, and $b$ denotes a
fixed base of the exponent. The domain of such a function is the set of all real numbers.
Thus, unlike the exponents in a polynomial function, the variable exponent $t$ in (10.1) is not
limited to positive integers, unless we wish to impose such a restriction.

But why the restriction of $b > 1$? The explanation is as follows. Since the domain of
the function in (10.1) consists of the set of all real numbers, it is possible for $t$ to take a
value such as $\frac{1}{2}$. If $b$ is allowed to be negative, the half power of $b$ will involve taking the
square root of a negative number. While this is not an impossible task, we would certainly
prefer to take the easy way out by restricting $b$ to be positive. Once we adopt the restriction
$b > 0$, however, we might as well go all the way to the restriction $b > 1$: The restriction
$b > 1$ differs from $b > 0$ only in the further exclusion of the cases of (1) $0 < b < 1$ and
(2) $b = 1$; but as will be shown, the first case can be subsumed under the restriction $b > 1$,
whereas the second case can be dismissed outright. Consider the first case. If $0 < b < 1$, then
$1/b > 1$ and we have

$$y = b^t = \left(\frac{1}{b}\right)^{-t}$$

This shows that a function with a fractional base can easily be rewritten into one with a base
greater than 1. As for the second case, the fact that $b = 1$ will give us the function
$y = 1^t = 1$, so that the exponential function actually degenerates into a constant function;
it may therefore be disqualified as a member of the exponential family.
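
The rewriting of a fractional base can be checked numerically. The short sketch below is illustrative only; the base $b = 0.2$ and the function names are arbitrary choices made for this example. It compares $b^t$ with $(1/b)^{-t}$ at several values of $t$, using a tolerance because floating-point arithmetic is inexact.

```python
import math


def with_fractional_base(b, t):
    """y = b^t with a base in the interval 0 < b < 1."""
    return b ** t


def with_base_above_one(b, t):
    """The same function rewritten with the base 1/b > 1: y = (1/b)^(-t)."""
    return (1 / b) ** (-t)


if __name__ == "__main__":
    b = 0.2  # arbitrary fractional base; 1/b = 5
    for t in (-2.0, -0.5, 0.0, 0.5, 3.5):
        left = with_fractional_base(b, t)
        right = with_base_above_one(b, t)
        print(t, left, right, math.isclose(left, right))
    # The degenerate case b = 1 gives the constant function y = 1.
    print([1 ** t for t in (-2.0, 0.0, 3.5)])
```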

Graphical Form 

The graph of the exponential function in (10.1) takes the general shape of the curve in 
Fig. 10.1. The curve drawn is based on the value b = 2; but even for other values of b, the 
same general configuration will prevail. 

Several salient features of this type of exponential curve may be noted. First, it is continuous
and smooth everywhere; thus the function should be everywhere differentiable. As
a matter of fact, it is continuously differentiable any number of times. Second, it is strictly
increasing, and in fact $y$ increases at an increasing rate throughout. Consequently, both
the first and second derivatives of the function $y = b^t$ should be positive, a fact we should
be able to confirm after we have developed the relevant differentiation formulas. Third,
we note that, even though the domain of the function contains negative as well as positive
numbers, the range of the function is limited to the open interval $(0, \infty)$. That is, the
dependent variable $y$ is invariably positive, regardless of the sign of the independent
variable $t$.

The strict monotonicity of the exponential function has at least two interesting and significant
implications. First, we may infer that the exponential function must have an inverse
function, which is itself strictly monotonic. This inverse function, we shall find, turns out to
be a logarithmic function. Second, since strict monotonicity means that there is a unique
value of $t$ for a given value of $y$ and since the range of the exponential function is the interval
$(0, \infty)$, it follows that we should be able to express any positive number as a unique
power of a base $b > 1$. This can be seen from Fig. 10.1, where the curve of $y = 2^t$ covers
all the positive values of $y$ in its range; therefore any positive value of $y$ must be expressible
as some unique power of the number 2. Actually, even if the base is changed to some other
real number greater than 1, the same range holds, so that it is possible to express any positive
number $y$ as a power of any base $b > 1$.

Generalized Exponential Function 

This last point deserves closer scrutiny. If a positive $y$ can indeed be expressed as powers of
various alternative bases, then there must exist a general procedure of base conversion. In the
case of the function $y = 9^t$, for instance, we can readily transform it into $y = (3^2)^t = 3^{2t}$,
thereby converting the base from 9 to 3, provided the exponent is duly altered from $t$ to $2t$.
This change in exponent, necessitated by the base conversion, does not create any new type
of function, for, if we let $w = 2t$, then $y = 3^{2t} = 3^w$ is still in the form of (10.1). From the
point of view of the base 3, however, the exponent is now $2t$ rather than $t$. What is the effect
of adding a numerical coefficient (here, 2) to the exponent $t$?
FIGURE 10.2

The answer is to be found in Fig. 10.2a, where two curves are drawn: one for the function
$y = f(t) = b^t$ and one for another function $y = g(t) = b^{2t}$. Since the exponent in the
latter is exactly twice that of the former, and since the identical base is adopted for the two
functions, the assignment of an arbitrary value $t = t_0$ in the function $g$ and $t = 2t_0$ in the
function $f$ must yield the same value:

$$f(2t_0) = g(t_0) = b^{2t_0} = y_0$$

Thus the distance $y_0J$ will be half of $y_0K$. By similar reasoning, for any value of $y$, the
function $g$ should be exactly halfway between the function $f$ and the vertical axis. It may
be concluded, therefore, that the doubling of the exponent has the effect of compressing
the exponential curve halfway toward the $y$ axis, whereas halving the exponent will extend
the curve away from the $y$ axis to twice the horizontal distance.

It is of interest that both functions share the same vertical intercept

$$f(0) = g(0) = b^0 = 1$$

The change of the exponent $t$ to $2t$, or to any other multiple of $t$, will leave the vertical
intercept unaffected. In terms of compressing, this is because compressing a zero horizontal
distance will still yield a zero distance.

The change of exponent is one way of modifying (and generalizing) the exponential
function of (10.1); another way is to attach a coefficient to \(b^t\), such as \(2b^t\). [Warning:
\(2b^t \neq (2b)^t\).] The effect of such a coefficient is also to compress or extend the curve, except
that this time the direction is vertical. In Fig. 10.2b, the higher curve represents \(y = 2b^t\),
and the lower one is \(y = b^t\). For every value of t, the former must obviously be twice as
high, because it has a y value twice as large as the latter. Thus we have \(t_0 J' = J'K'\). Note
that the vertical intercept, too, is changed in the present case. We may conclude that
doubling the coefficient (here, from 1 to 2) serves to extend the curve away from the horizontal
axis to twice the vertical distance, whereas halving the coefficient will compress the
curve halfway toward the t axis.

With the knowledge of the two modifications just discussed, the exponential function
\(y = b^t\) can now be generalized to the form

\[ y = ab^{ct} \qquad (10.2) \]

where a and c are “compressing” or “extending” agents. When assigned various values,
they will alter the position of the exponential curve, thus generating a whole family of
exponential curves (functions). If a and c are positive, the general configuration shown in
Fig. 10.2 will prevail; if a or c or both are negative, however, then fundamental modifications
will occur in the configuration of the curve (see Exercise 10.1-5).
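
The roles of a and c in (10.2) can be tried out with a few numbers. The sketch below is a hedged illustration: the function name general_exp and all parameter values are our own assumptions, chosen only to show the vertical intercept a and the base conversion from 9 to 3.

```python
def general_exp(t, a, b, c):
    """Evaluate the generalized exponential function y = a * b**(c*t)."""
    return a * b ** (c * t)


# The coefficient a fixes the vertical intercept, whatever c may be.
print(general_exp(0.0, a=2.0, b=3.0, c=5.0))

# Base conversion: 9**t equals 3**(2t), so (b=9, c=1) and (b=3, c=2) coincide.
for t in (0.5, 1.0, 1.5):
    print(t, general_exp(t, 1.0, 9.0, 1.0), general_exp(t, 1.0, 3.0, 2.0))

# A negative a reflects the curve below the horizontal axis.
print(general_exp(1.0, a=-1.0, b=2.0, c=1.0))
```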

A Preferred Base 

What prompted the discussion of the change of exponent from t to ct was the question of 
base conversion. But, granting the feasibility of base conversion, why would one want to do 
it anyhow? One answer is that some bases are more convenient than others as far as mathematical
manipulations are concerned.

Curiously enough, in calculus, the preferred base happens to be a certain irrational number
denoted by the symbol e:

\[ e = 2.71828\ldots \]


When this base e is used in an exponential function, it is referred to as a natural exponential
function, examples of which are

\[ y = e^t \qquad y = e^{3t} \qquad y = Ae^{rt} \]

These illustrative functions can also be expressed by the alternative notations

\[ y = \exp(t) \qquad y = \exp(3t) \qquad y = A\exp(rt) \]

where the abbreviation exp (for exponential) indicates that e is to have as its exponent the
expression in parentheses.

The choice of such an outlandish number as e = 2.71828... as the preferred base no
doubt seems bewildering. But there is an excellent reason for this choice, for the function
\(e^t\) possesses the remarkable property of being its own derivative! That is,

\[ \frac{d}{dt} e^t = e^t \]

a fact that reduces the work of differentiation to no work at all. Moreover, armed with this
differentiation rule (to be proved in Section 10.5), it will also be easy to find the derivative
of a more complicated natural exponential function such as \(y = Ae^{rt}\). To do this, first
let w = rt, so that the function becomes

\[ y = Ae^w \qquad \text{where } w = rt, \text{ and } A, r \text{ are constants} \]

Then, by the chain rule, we can write

\[ \frac{dy}{dt} = \frac{dy}{dw}\frac{dw}{dt} = Ae^w(r) = rAe^{rt} \]

That is,

\[ \frac{d}{dt} Ae^{rt} = rAe^{rt} \qquad (10.3) \]

The mathematical convenience of the base e should thus be amply clear.
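
Rule (10.3) can be checked against a finite-difference approximation of the derivative. The sketch below is a numerical check, not a proof; the helper names, the step size h, and the values of A, r, and t are all illustrative assumptions.

```python
import math


def natural_exp(t, A=1.0, r=1.0):
    """The natural exponential function y = A * e**(r*t)."""
    return A * math.exp(r * t)


def central_difference(func, t, h=1e-6):
    """Approximate the derivative of func at t by a symmetric difference quotient."""
    return (func(t + h) - func(t - h)) / (2 * h)


A, r, t = 4.0, 0.5, 2.0
numeric = central_difference(lambda s: natural_exp(s, A, r), t)
analytic = r * natural_exp(t, A, r)  # the right-hand side of (10.3)
print(numeric, analytic)
```

The two printed figures should agree to many decimal places, differing only by the approximation error of the difference quotient.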
EXERCISE 10.1

1. Plot in a single diagram the graphs of the exponential functions \(y = 3^t\) and \(y = 3^{2t}\).

(a) Do the two graphs display the same general positional relationship as shown in
Fig. 10.2a?

(b) Do these two curves share the same y intercept? Why?

(c) Sketch the graph of the function \(y = 3^{3t}\) in the same diagram.

2. Plot in a single diagram the graphs of the exponential functions \(y = 4^t\) and \(y = 3(4^t)\).

(a) Do the two graphs display the general positional relationship suggested in
Fig. 10.2b?

(b) Do the two curves have the same y intercept? Why?

(c) Sketch the graph of the function y = |(4') in the same diagram. 

3. Taking for granted that \(e^t\) is its own derivative, use the chain rule to find dy/dt for the
following:

(a) y = e St (ft) y = 4e v (c ) y = 6e~ h 

4. In view of our discussion about (10.1), do you expect the function \(y = e^t\) to be strictly
increasing at an increasing rate? Verify your answer by determining the signs of the first
and second derivatives of this function. In doing so, remember that the domain of this
function is the set of all real numbers, i.e., the interval \((-\infty, \infty)\).

5. In (10.2), if negative values are assigned to a and c, the general shape of the curves in
Fig. 10.2 will no longer prevail. Examine the change in curve configuration by contrasting
(a) the case of a = -1 against the case of a = 1, and (b) the case of c = -1
against the case of c = 1.

10.2 Natural Exponential Functions 
and the Problem of Growth 


The pertinent questions still unanswered are: How is the number e defined? Does it have
any economic meaning in addition to its mathematical significance as a convenient base?
And, in what ways do natural exponential functions apply to economic analysis?

The Number e 

Let us consider the following function:

\[ f(m) = \left(1 + \frac{1}{m}\right)^m \qquad (10.4) \]

If larger and larger values are assigned to m, then f(m) will also assume larger values;
specifically, we find that

\[ f(1) = \left(1 + \tfrac{1}{1}\right)^1 = 2 \]
\[ f(2) = \left(1 + \tfrac{1}{2}\right)^2 = 2.25 \]
\[ f(3) = \left(1 + \tfrac{1}{3}\right)^3 = 2.37037\ldots \]
\[ f(4) = \left(1 + \tfrac{1}{4}\right)^4 = 2.44141\ldots \]

Moreover, if m is increased indefinitely, then f(m) will converge to the number
2.71828... = e; thus e may be defined as the limit of (10.4) as \(m \to \infty\):

\[ e = \lim_{m \to \infty} f(m) = \lim_{m \to \infty} \left(1 + \frac{1}{m}\right)^m \qquad (10.5) \]
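
The convergence asserted in (10.5) is easy to observe by direct computation. The following Python sketch simply evaluates (10.4) for increasingly large m; the particular values of m are arbitrary, and floating-point arithmetic limits how far the experiment can be pushed.

```python
import math


def f(m):
    """The function (10.4): (1 + 1/m)**m."""
    return (1 + 1 / m) ** m


for m in (1, 2, 3, 4, 100, 10_000, 1_000_000):
    print(m, f(m))

print("math.e =", math.e)
```

The first four lines reproduce the values 2, 2.25, 2.37037..., and 2.44141... listed above, and the later ones creep up toward 2.71828... from below.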


That the approximate value of e is 2.71828 can be verified by finding the Maclaurin
series of the function \(\phi(x) = e^x\), with x used here to facilitate the direct application of the
expansion formula (9.14). Such a series will give us a polynomial approximation to \(e^x\), and
thus the value of e (\(= e^1\)) may be approximated by setting x = 1 in that polynomial. If the
remainder term \(R_n\) approaches zero as the number of terms in the series is increased indefinitely,
i.e., if the series is convergent to \(\phi(x)\), then we can indeed approximate the value of
e to any desired degree of accuracy by making the number of included terms sufficiently
large.

To this end, we need to have derivatives of various orders for the function. Accepting the
fact that the first derivative of \(e^x\) is \(e^x\) itself, we can see that the derivative of \(\phi(x)\) is simply
\(e^x\) and, similarly, that the second, third, or any higher-order derivatives must be \(e^x\) as well.
Hence, when we evaluate all the derivatives at the expansion point (\(x_0 = 0\)), we have the
gratifyingly neat result

\[ \phi'(0) = \phi''(0) = \cdots = \phi^{(n)}(0) = e^0 = 1 \]

Consequently, by setting \(x_0 = 0\) in (9.14), the Maclaurin series of \(e^x\) is

\[ e^x = \phi(x) = \phi(0) + \phi'(0)x + \frac{\phi''(0)}{2!}x^2 + \frac{\phi'''(0)}{3!}x^3 + \cdots + \frac{\phi^{(n)}(0)}{n!}x^n + R_n \]
\[ = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots + \frac{1}{n!}x^n + R_n \]

The remainder term \(R_n\), according to (9.15), can be written as

\[ R_n = \frac{\phi^{(n+1)}(p)}{(n+1)!}x^{n+1} = \frac{e^p}{(n+1)!}x^{n+1} \qquad [\phi^{(n+1)}(x) = e^x;\ \therefore\ \phi^{(n+1)}(p) = e^p] \]

Inasmuch as the factorial expression (n + 1)! increases in value more rapidly than the
power expression \(x^{n+1}\) (for a finite x) as n increases, it follows that \(R_n \to 0\) as \(n \to \infty\).
Thus the Maclaurin series converges, and the value of \(e^x\) may, as a result, be expressed as a
convergent infinite series as follows:

\[ e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \frac{1}{4!}x^4 + \frac{1}{5!}x^5 + \cdots \qquad (10.6) \]

As a special case, for x = 1, we find that

\[ e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} + \cdots \]
\[ = 2 + 0.5 + 0.1666667 + 0.0416667 + 0.0083333 + 0.0013889 \]
\[ \quad + 0.0001984 + 0.0000248 + 0.0000028 + 0.0000003 + \cdots \]
\[ = 2.7182819 \]

Thus, if we want a figure accurate to five decimal places, we can write e = 2.71828. Note
that we need not worry about the subsequent terms in the infinite series, because they will
be of negligible magnitude if we are concerned only with five decimal places.
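
The partial sums of (10.6) can be accumulated in a few lines of Python. This is a minimal sketch: the function name exp_series and the choice of how many terms to print are our own, and each new term is built from the previous one rather than by computing factorials afresh.

```python
import math


def exp_series(x, n_terms):
    """Sum the first n_terms terms of 1 + x + x**2/2! + x**3/3! + ..."""
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)  # turns x**k/k! into x**(k+1)/(k+1)!
    return total


for n in (2, 4, 6, 8, 11):
    print(n, exp_series(1.0, n))

print("math.e =", math.e)
```

With eleven terms (through 1/10!) the sum at x = 1 matches the hand calculation above to the decimal places shown there.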

An Economic Interpretation of e 

Mathematically, the number e is the limit expression in (10.5). But does it also possess 
some economic meaning? The answer is that it can be interpreted as the result of a special 
mode of interest compounding. 

Suppose that, starting out with a principal (or capital) of $1, we find a hypothetical
banker to offer us the unusual interest rate of 100 percent per annum ($1 interest per year).
If interest is to be compounded once a year, the value of our asset at the end of the year will
be $2; we shall denote this value by V(1), where the number in parentheses indicates the
frequency of compounding within 1 year:

\[ V(1) = \text{initial principal}\,(1 + \text{interest rate}) = 1(1 + 100\%) = \left(1 + \tfrac{1}{1}\right)^1 = 2 \]

If interest is compounded semiannually, however, an interest amounting to 50 percent
(half of 100 percent) of principal will accrue at the end of 6 months. We shall therefore have
$1.50 as the new principal during the second 6-month period, in which interest will be
calculated at 50 percent of $1.50. Thus our year-end asset value will be 1.50(1 + 50%);
that is,

\[ V(2) = (1 + 50\%)(1 + 50\%) = \left(1 + \tfrac{1}{2}\right)^2 \]

By analogous reasoning, we can write \(V(3) = (1 + \tfrac{1}{3})^3\), \(V(4) = (1 + \tfrac{1}{4})^4\), etc.; or, in
general,

\[ V(m) = \left(1 + \frac{1}{m}\right)^m \qquad (10.7) \]

where m represents the frequency of compounding in 1 year.

In the limiting case, when interest is compounded continuously during the year, i.e.,
when m becomes infinite, the value of the asset will grow in a “snowballing” fashion,
becoming at the end of 1 year

\[ \lim_{m \to \infty} V(m) = \lim_{m \to \infty} \left(1 + \frac{1}{m}\right)^m = e \text{ (dollars)} \qquad \text{[by (10.5)]} \]

Thus, the number e = 2.71828 can be interpreted as the year-end value to which a principal
of $1 will grow if interest at the rate of 100 percent per annum is compounded
continuously.

Note that the interest rate of 100 percent is only a nominal interest rate, for if $1
becomes $e = $2.718 after 1 year, the effective interest rate is in this case approximately
172 percent per annum.

Interest Compounding and the Function \(Ae^{rt}\)

The continuous interest-compounding process just discussed can be generalized in three
directions, to allow for: (1) more years of compounding, (2) a principal other than $1, and
(3) a nominal interest rate other than 100 percent.

If a principal of $1 becomes $e after 1 year of continuous compounding and if we let $e
be the new principal in the second year (during which every dollar will again grow into $e),
our asset value at the end of 2 years will obviously become $e(e) = $e^2. By the same token,
it will become $e^3 at the end of 3 years or, more generally, will become $e^t after t years.

Next, let us change the principal from $1 to an unspecified amount, $A. This change is
easily taken care of: if $1 will grow into $e^t after t years of continuous compounding at the
nominal rate of 100 percent per annum, it stands to reason that $A will grow into $Ae^t.

How about a nominal interest rate of other than 100 percent, for instance, r = 0.05
(= 5 percent)? The effect of this rate change is to alter the expression \(Ae^t\) to \(Ae^{rt}\), as can be
verified from the following. With an initial principal of $A, to be invested for t years at a
nominal interest rate r, the compound-interest formula (10.7) must be modified to the form

\[ V(m) = A\left(1 + \frac{r}{m}\right)^{mt} \qquad (10.8) \]

The insertion of the coefficient A reflects the change of principal from the previous level of
$1. The quotient expression r/m means that, in each of the m compounding periods in a
year, only 1/m of the nominal rate r will actually be applicable. Finally, the exponent mt
tells us that, since interest is to be compounded m times a year, there should be a total of mt
compoundings in t years.

The formula (10.8) can be transformed into an alternative form

\[ V(m) = A\left[\left(1 + \frac{r}{m}\right)^{m/r}\right]^{rt} = A\left[\left(1 + \frac{1}{w}\right)^{w}\right]^{rt} \qquad \text{where } w \equiv \frac{m}{r} \qquad (10.8') \]

As the frequency of compounding m is increased, the newly created variable w must
increase pari passu; thus, as \(m \to \infty\), we have \(w \to \infty\), and the bracketed expression in
(10.8'), by virtue of (10.5), tends to the number e. Consequently, we find the asset value in
the generalized continuous-compounding process to be

\[ V \equiv \lim_{m \to \infty} V(m) = Ae^{rt} \qquad (10.8'') \]

as anticipated.
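
The passage from (10.8) to (10.8'') can be watched numerically by raising the compounding frequency m. The sketch below is illustrative only: the function names and the figures A = 100, r = 0.05, t = 3 are assumptions of ours, not data from the text.

```python
import math


def discrete_value(A, r, t, m):
    """Asset value A(1 + r/m)**(m*t) with m compoundings per year, as in (10.8)."""
    return A * (1 + r / m) ** (m * t)


def continuous_value(A, r, t):
    """Limiting asset value A*e**(r*t), as in (10.8'')."""
    return A * math.exp(r * t)


A, r, t = 100.0, 0.05, 3
for m in (1, 4, 12, 365, 100_000):
    print(m, discrete_value(A, r, t, m))
print("continuous", continuous_value(A, r, t))
```

For a positive r, each increase in m raises the year-end value slightly, and the values settle just below the continuous figure.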

Note that, in (10.8), t is a discrete (as against a continuous) variable: It can only take values
that are integral multiples of 1/m. For example, if m = 4 (compounding on a quarterly
basis), then t can only take the values of 1/4, 1/2, 3/4, 1, etc., indicating that V(m) will assume
a new value only at the end of each new quarter. When \(m \to \infty\), as in (10.8''), however,
1/m becomes infinitesimal, and accordingly the variable t will become continuous. In that
case, it becomes legitimate to speak of fractions of a year and to let t be, say, 1.2 or 2.35.

The upshot is that the expressions \(e\), \(e^t\), \(Ae^t\), and \(Ae^{rt}\) can all be interpreted economically
in connection with continuous interest compounding, as summarized in Table 10.1.

TABLE 10.1  Continuous Interest Compounding

Principal, $   Nominal          Years of Continuous   Asset Value, at the End of
               Interest Rate    Compounding           Compounding Process, $

1              100% (= 1)       1                     e
1              100%             t                     e^t
A              100%             t                     Ae^t
A              r                t                     Ae^{rt}

Instantaneous Rate of Growth

It should be pointed out, however, that interest compounding is an illustrative, but not
exclusive, interpretation of the natural exponential function \(Ae^{rt}\). Interest compounding
merely exemplifies the general process of exponential growth (here, the growth of a sum
of money capital over time), and we can apply the function equally well to the growth of
population, wealth, or real capital.

Applied to some context other than interest compounding, the coefficient r in \(Ae^{rt}\) no
longer denotes the nominal interest rate. What economic meaning does it then take? The
answer is that r can be reinterpreted as the instantaneous rate of growth of the function
\(Ae^{rt}\). (In fact, this is why we have adopted the symbol r, for rate of growth, in the first
place.) Given the function \(V = Ae^{rt}\), which gives the value of V at each point of time t, the
rate of change of V is to be found in the derivative

\[ \frac{dV}{dt} = rAe^{rt} = rV \qquad \text{[see (10.3)]} \]

But the rate of growth of V is simply the rate of change in V expressed in relative
(percentage) terms, i.e., expressed as a ratio to the value of V itself. Thus, for any given
point of time, we have

\[ \text{Rate of growth of } V \equiv \frac{dV/dt}{V} = \frac{rV}{V} = r \qquad (10.9) \]

as was stated previously.

Several observations should be made about this rate of growth. But, first, let us clarify a
fundamental point regarding the concept of time, namely, the distinction between a point of
time and a period of time. The variable V (denoting a sum of money, or the size of population,
etc.) is a stock concept, which is concerned with the question: How much of it exists
at a given moment? As such, V is related to the point concept of time: at each point of time,
V takes a unique value. The change in V, on the other hand, represents a flow, which
involves the question: How much of it takes place during a given time span? Hence a
change in V and, by the same token, the rate of change of V must have reference to some
specified period of time, say, per year.

With this understanding, let us return to (10.9) for some comments: 

1. The rate of growth defined in (10.9) is an instantaneous rate of growth. Since the derivative
\(dV/dt = rAe^{rt}\) takes a different value at a different point of t, as will \(V = Ae^{rt}\),
their ratio must also have reference to a specific point (or instant) of t. In this sense, the
rate of growth is instantaneous.

2. In the present case, however, the instantaneous rate of growth happens to be a constant
r, with the rate of growth thus remaining uniform at all points of time. This may not, of
course, be true of all growth situations actually encountered.

3. Even though the rate of growth r is measured at a particular point of time, its magnitude
nevertheless has the connotation of so many percent per unit of time, say, per year (if t is
measured in year units). Growth, by its very nature, can occur only over a time interval.
This is why a single still picture (recording the situation at one instant) could never portray,
say, the growth of a child, whereas two still pictures taken at different times (say,
a year apart) can accomplish this. To say that V has a rate of growth of r at the instant
\(t = t_0\), therefore, really means that, if the rate of change dV/dt (= rV) prevailing at
\(t = t_0\) is allowed to continue undisturbed for one whole unit of time (1 year), then V will
have grown by the amount rV at the end of the year.

4. For the exponential function \(V = Ae^{rt}\), the percentage rate of growth is constant at all
points of t, but the absolute amount of increment of V increases as time goes on, because
the percentage rate will be calculated on larger and larger bases.

Upon interpreting r as the instantaneous rate of growth, it is clear that little effort will
henceforth be required to find the rate of growth of a natural exponential function of the
form \(y = Ae^{rt}\), provided r is a constant. Given a function \(y = 75e^{0.02t}\), for instance, we can
immediately read off the rate of growth of y as 0.02 or 2 percent per period.
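
The constancy of the growth rate, together with the rising absolute increment, can be seen numerically for the function \(y = 75e^{0.02t}\) just mentioned. The sketch below approximates (dV/dt)/V by a difference quotient; the helper name growth_rate, the step size h, and the sample dates are illustrative assumptions.

```python
import math


def V(t, A=75.0, r=0.02):
    """The growth function V = A * e**(r*t)."""
    return A * math.exp(r * t)


def growth_rate(func, t, h=1e-5):
    """Approximate the instantaneous rate of growth (dV/dt)/V at time t."""
    dV_dt = (func(t + h) - func(t - h)) / (2 * h)
    return dV_dt / func(t)


# Second column: the rate of growth (constant at r).
# Third column: the rate of change rV (larger at later dates).
for t in (0.0, 10.0, 50.0):
    print(t, growth_rate(V, t), 0.02 * V(t))
```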


Continuous versus Discrete Growth 

The preceding discussion, though analytically interesting, is still open to question insofar
as economic relevance is concerned, because in actuality growth does not always take place
on a continuous basis, not even in interest compounding. Fortunately, however, even for
cases of discrete growth, where changes occur only once per period rather than from instant
to instant, the continuous exponential growth function can be justifiably used.

For one thing, in cases where the frequency of compounding is relatively high, though
not infinite, the continuous pattern of growth may be regarded as an approximation to the
true growth pattern. But, more importantly, we can show that a problem of discrete or
discontinuous growth can always be transformed into an equivalent continuous version.

Suppose that we have a geometric pattern of growth (say, the discrete compounding of
interest) as shown by the following sequence:

\[ A,\ A(1+i),\ A(1+i)^2,\ A(1+i)^3,\ \ldots \]

where the effective interest rate per period is denoted by i and where the exponent of the
expression (1 + i) denotes the number of periods covered in the compounding. If we consider
(1 + i) to be the base b in an exponential expression, then the given sequence may be
summarized by the exponential function \(Ab^t\), except that, because of the discrete nature
of the problem, t is restricted to integer values only. Moreover, b = 1 + i is a positive number
(positive even if i is a negative interest rate, say, -0.04), so that it can always be
expressed as a power of any real number greater than 1, including e. This means that there
must exist a number r such that†

\[ 1 + i = b = e^r \]

† The method of finding the number r, given a specific value of b, will be discussed in Sec. 10.4.

Thus we can transform \(Ab^t\) into a natural exponential function:

\[ A(1+i)^t = Ab^t = Ae^{rt} \]

For any given value of t (in this context, integer values of t), the function \(Ae^{rt}\) will,
of course, yield exactly the same value as \(A(1+i)^t\), such as \(A(1+i) = Ae^r\) and
\(A(1+i)^2 = Ae^{2r}\). Consequently, even though a discrete case \(A(1+i)^t\) is being considered,
we may still work with the continuous natural exponential function \(Ae^{rt}\). This
explains why natural exponential functions are extensively applied in economic analysis
despite the fact that not all growth patterns may actually be continuous.
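
The equivalence of the discrete and continuous versions can be illustrated with numbers. The sketch below borrows the natural logarithm, which the text takes up only in later sections, to find the r satisfying 1 + i = e^r; the function name continuous_rate and the figures A = 100 and i = 0.05 are our own assumptions.

```python
import math


def continuous_rate(i):
    """Return the r for which e**r = 1 + i (requires 1 + i > 0)."""
    return math.log(1 + i)


A, i = 100.0, 0.05
r = continuous_rate(i)
print("r =", r)

# At integer values of t the two formulas give the same asset value.
for t in range(4):
    print(t, A * (1 + i) ** t, A * math.exp(r * t))
```

Note that r comes out slightly smaller than i, and that a negative i (such as -0.04) yields a negative r.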

Discounting and Negative Growth 

Let us now turn briefly from interest compounding to the closely related concept of
discounting. In a compound-interest problem, we seek to compute the future value V (principal
plus interest) from a given present value A (initial principal). The problem of
discounting is the opposite one of finding the present value A of a given sum V which is to
be available t years from now.

Let us take the discrete case first. If the amount of principal A will grow into the future
value of \(A(1+i)^t\) after t years of annual compounding at the interest rate i per annum,
i.e., if

\[ V = A(1+i)^t \]

then, by dividing both sides of the equation by the nonzero expression \((1+i)^t\), we can get
the discounting formula:

\[ A = \frac{V}{(1+i)^t} = V(1+i)^{-t} \qquad (10.10) \]

which involves a negative exponent. It should be realized that in this formula the roles of V
and A have been reversed: V is now a given, whereas A is the unknown, to be computed
from i (the rate of discount) and t (the number of years), as well as V.

Similarly, for the continuous case, if the principal A will grow into \(Ae^{rt}\) after t years of
continuous compounding at the rate r in accordance with the formula

\[ V = Ae^{rt} \]

then we can derive the corresponding continuous-discounting formula simply by dividing
both sides of the last equation by \(e^{rt}\):

\[ A = \frac{V}{e^{rt}} = Ve^{-rt} \qquad (10.11) \]

Here again, we have A (rather than V) as the unknown, to be computed from the given
future value V, the nominal rate of discount r, and the number of years t. The expression
\(e^{-rt}\) is often referred to as the discount factor.

Taking (10.11) as an exponential growth function, we can immediately read -r as the
instantaneous rate of growth of A. Being negative, this rate is in effect a rate of decay. Just
as interest compounding exemplifies the process of growth, discounting illustrates negative
growth.
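
Formulas (10.10) and (10.11) translate directly into code. The following sketch is an illustration under assumed figures (a future sum of 1,000 due in 10 years, discounted at 5 percent); the function names are ours.

```python
import math


def present_value_discrete(V, i, t):
    """Discrete discounting, A = V(1 + i)**(-t), as in (10.10)."""
    return V * (1 + i) ** (-t)


def present_value_continuous(V, r, t):
    """Continuous discounting, A = V*e**(-r*t), as in (10.11)."""
    return V * math.exp(-r * t)


V, rate, years = 1000.0, 0.05, 10
A_discrete = present_value_discrete(V, rate, years)
A_continuous = present_value_continuous(V, rate, years)
print(A_discrete, A_continuous)

# Compounding the present value forward again recovers the future value.
print(A_discrete * (1 + rate) ** years, A_continuous * math.exp(rate * years))
```

At the same numerical rate, continuous discounting gives the smaller present value, since continuous compounding makes a principal grow faster than annual compounding does.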
EXERCISE 10.2

1. Use the infinite-series form of \(e^x\) in (10.6) to find the approximate value of:

(a) \(e^2\) (b) \(\sqrt{e}\) (\(= e^{1/2}\))

(Round off your calculation of each term to three decimal places, and continue with the
series till you get a term 0.000.)

2. Given the function \(\phi(x) = e^{2x}\):

(a) Write the polynomial part \(P_n\) of its Maclaurin series.

(b) Write the Lagrange form of the remainder \(R_n\). Determine whether \(R_n \to 0\) as
\(n \to \infty\), that is, whether the series is convergent to \(\phi(x)\).

(c) If convergent, so that \(\phi(x)\) may be expressed as an infinite series, write out this
series.

3. Write an exponential expression for the value:

(a) $70, compounded continuously at the interest rate of 4% for 3 years

(b) $690, compounded continuously at the interest rate of 5% for 2 years
(These interest rates are nominal rates per annum.)

4. What is the instantaneous rate of growth of y in each of the following?

(o) y = f m (c) y = A<? At 

(b) \(y = 15e^{0.03t}\) (d) \(y = 0.03e^t\)

5. Show that the two functions \(y_1 = Ae^{rt}\) (interest compounding) and \(y_2 = Ae^{-rt}\)
(discounting) are mirror images of each other with reference to the y axis [cf. Exercise
10.1-5, part (b)].

10.3 Logarithms

Exponential functions are closely related to logarithmic functions (log functions, for short).
Before we can discuss log functions, we must first understand the meaning of the term 
logarithm. 

The Meaning of Logarithm 

When we have two numbers such as 4 and 16, which can be related to each other by the
equation \(4^2 = 16\), we define the exponent 2 to be the logarithm of 16 to the base of 4, and
write

\[ \log_4 16 = 2 \]

It should be clear from this example that the logarithm is nothing but the power to which a
base (4) must be raised to attain a particular number (16). In general, we may state that

\[ y = b^t \quad \Leftrightarrow \quad t = \log_b y \qquad (10.12) \]

which indicates that the log of y to the base b (denoted by \(\log_b y\)) is the power to which the
base b must be raised in order to attain the value y. For this reason, it is correct, though
tautological, to write

\[ b^{\log_b y} = y \]

Given y, the process of finding its logarithm \(\log_b y\) is referred to as taking the log of y to the
base b. The reverse process, that of finding y from a known value of its logarithm \(\log_b y\), is
referred to as taking the antilog of \(\log_b y\).

In the discussion of exponential functions, we emphasized that the function \(y = b^t\)
(with b > 1) is strictly increasing. This means that, for any positive value of y, there is a
unique exponent t (not necessarily positive) such that \(y = b^t\); moreover, the larger the value
of y, the larger must be t, as can be seen from Fig. 10.2. Translated into logarithms, the strict
monotonicity of the exponential function implies that any positive number y must possess
a unique logarithm t to a base b > 1 such that the larger the y, the larger its logarithm.
As Figs. 10.1 and 10.2 show, y is necessarily positive in the exponential function \(y = b^t\);
consequently, a negative number or zero cannot possess a logarithm.
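
The definition in (10.12) can be tried out with Python's standard math.log, which accepts a base as its optional second argument. The wrapper log_base below is a hypothetical helper of ours; it merely adds an explicit check that y is positive, in line with the remark that zero and negative numbers possess no logarithm.

```python
import math


def log_base(y, b):
    """Return the exponent t for which b**t equals y (y > 0, base b > 1)."""
    if y <= 0:
        raise ValueError("only positive numbers possess a logarithm")
    return math.log(y, b)


t = log_base(16, 4)
print(t)       # the log of 16 to the base 4
print(4 ** t)  # taking the antilog brings us back to (approximately) 16

# Strict monotonicity: the larger the y, the larger its logarithm.
print(log_base(0.1, 10), log_base(1, 10), log_base(1000, 10))
```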


Common Log and Natural Log 

The base of the logarithm, b > 1, does not have to be restricted to any particular number,
but in actual log applications two numbers are widely chosen as bases: the number 10 and
the number e. When 10 is the base, the logarithm is known as the common logarithm, symbolized
by \(\log_{10}\) (or if the context is clear, simply by log). With e as the base, on the other
hand, the logarithm is referred to as the natural logarithm and is denoted either by \(\log_e\) or
by ln (for natural log). We may also use the symbol log (without subscript e) if it is not
ambiguous in the particular context.

Common logarithms, used frequently in computational work, are exemplified by the
following:

\[
\begin{array}{ll}
\log_{10} 1{,}000 = 3 & [\text{because } 10^3 = 1{,}000] \\
\log_{10} 100 = 2 & [\text{because } 10^2 = 100] \\
\log_{10} 10 = 1 & [\text{because } 10^1 = 10] \\
\log_{10} 1 = 0 & [\text{because } 10^0 = 1] \\
\log_{10} 0.1 = -1 & [\text{because } 10^{-1} = 0.1] \\
\log_{10} 0.01 = -2 & [\text{because } 10^{-2} = 0.01]
\end{array}
\]

Observe the close relation between the set of numbers immediately to the left of the equals
signs and the set of numbers immediately to the right. From these, it should be apparent that
the common logarithm of a number between 10 and 100 must be between 1 and 2 and that
the common logarithm of a number between 1 and 10 must be a positive fraction, etc. The
exact logarithms can easily be obtained from a table of common logarithms or electronic
calculators with log capabilities.†

In analytical work, however, natural logarithms prove vastly more convenient to use
than common logarithms. Since, by the definition of logarithm, we have the relationship

\[ y = e^t \quad \Leftrightarrow \quad t = \log_e y \quad (\text{or } t = \ln y) \qquad (10.13) \]

it is easy to see that the analytical convenience of e in exponential functions will automatically
extend into the realm of logarithms with e as the base.

† More fundamentally, the value of a logarithm, like the value of e, can be calculated (or
approximated) by resorting to a Maclaurin expansion of a log function, in a manner similar to that
outlined in (10.6). However, we shall not venture into this derivation here.

The following examples will serve to illustrate natural logarithms:

\[
\begin{array}{l}
\ln e^3 = \log_e e^3 = 3 \\
\ln e^2 = \log_e e^2 = 2 \\
\ln e^1 = \log_e e^1 = 1 \\
\ln 1 = \log_e e^0 = 0 \\
\ln \dfrac{1}{e} = \log_e e^{-1} = -1
\end{array}
\]

The general principle emerging from these examples is that, given an expression \(e^k\), where
k is any real number, we can automatically read the exponent k as the natural log of \(e^k\). In
general, therefore, we have the result that \(\ln e^k = k\).†

The common log and natural log are convertible into each other; i.e., the base of a logarithm
can be changed, just as the base of an exponential expression can. A pair of conversion
formulas will be developed after we have studied the basic rules of logarithms.

Rules of Logarithms 

Logarithms are in the nature of exponents; therefore, they obey certain rules closely related
to the rules of exponents introduced in Sec. 2.5. These can be of great help in simplifying
mathematical operations. The first three rules are stated only in terms of the natural log, but
they are also valid when the symbol ln is replaced by \(\log_b\).

Rule I (log of a product)  \(\ln(uv) = \ln u + \ln v\)  (u, v > 0)

Example 1  \(\ln(e^6 e^4) = \ln e^6 + \ln e^4 = 6 + 4 = 10\)

Example 2  \(\ln(Ae^7) = \ln A + \ln e^7 = \ln A + 7\)

Proof  By definition, ln u is the power to which e must be raised to attain the value of u;
thus \(e^{\ln u} = u\).‡ Similarly, we have \(e^{\ln v} = v\) and \(e^{\ln(uv)} = uv\). The latter is an exponential
expression for uv. However, another expression of uv is obtainable by direct multiplication
of u and v:

\[ uv = e^{\ln u} e^{\ln v} = e^{\ln u + \ln v} \]

Thus, by equating the two expressions for uv above, we find

\[ e^{\ln(uv)} = e^{\ln u + \ln v} \quad \text{and hence} \quad \ln(uv) = \ln u + \ln v \]

† As a mnemonic device, observe that when the symbol ln (or \(\log_e\)) is placed at the left of the
expression \(e^k\), the symbol ln seems to cancel out the symbol e, leaving k as the answer.

‡ Note that when e is raised to the power ln u, the symbol e and the symbol ln again seem to cancel
out, leaving u as the answer.

Rule II (log of a quotient)  \(\ln(u/v) = \ln u - \ln v\)  (u, v > 0)

Example 3  \(\ln(e^2/c) = \ln e^2 - \ln c = 2 - \ln c\)

Example 4  \(\ln(e^2/e^5) = \ln e^2 - \ln e^5 = 2 - 5 = -3\)

The proof of this rule is very similar to that of Rule I and is therefore left to you as an
exercise.

Rule III (log of a power)  \(\ln u^a = a \ln u\)  (u > 0)

Example 5  \(\ln e^{15} = 15 \ln e = 15\)

Example 6  \(\ln A^3 = 3 \ln A\)

Proof  By definition, \(e^{\ln u} = u\); and similarly, \(e^{\ln u^a} = u^a\). However, another expression for
\(u^a\) can be formed as follows:

\[ u^a = (e^{\ln u})^a = e^{a \ln u} \]

By equating the exponents in the two expressions for \(u^a\), we obtain the desired result,
\(\ln u^a = a \ln u\).

These three rules are useful devices for simplifying the mathematical operations in certain
types of problems. Rule I serves to convert, via logarithms, a multiplicative operation (uv)
into an additive one (ln u + ln v); Rule II turns a division (u/v) into a subtraction (ln u - ln v);
and Rule III enables us to reduce a power to a multiplicative constant. Moreover, these rules
can be used in combination. Also, they can be read backward, and applied in reverse.

Example 7  \(\ln(uv^a) = \ln u + \ln v^a = \ln u + a \ln v\)

Example 8  \(\ln u + a \ln v = \ln u + \ln v^a = \ln(uv^a)\)  [Example 7 in reverse]

You are warned, however, that when we have additive expressions to begin with, logarithms
may be of no help at all. In particular, it should be remembered that

\[ \ln(u \pm v) \neq \ln u \pm \ln v \]
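
Rules I to III, and the warning about additive expressions, can be spot-checked numerically. The sketch below is only a check at one arbitrary set of values (u = 3, v = 8, a = 2.5, all our own choices), not a substitute for the proofs; the function name check_log_rules is hypothetical.

```python
import math


def check_log_rules(u, v, a):
    """Return (left side, right side) pairs for Rules I-III at u, v > 0."""
    return {
        "product": (math.log(u * v), math.log(u) + math.log(v)),
        "quotient": (math.log(u / v), math.log(u) - math.log(v)),
        "power": (math.log(u ** a), a * math.log(u)),
    }


for name, (lhs, rhs) in check_log_rules(3.0, 8.0, 2.5).items():
    print(name, lhs, rhs)

# The warning in the text: the log of a sum is NOT the sum of the logs.
print(math.log(3.0 + 8.0), math.log(3.0) + math.log(8.0))
```

In each of the first three lines the two figures coincide up to rounding error, whereas the last line shows two clearly different numbers.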

Let us now introduce two additional rules concerned with changes in the base of a
logarithm.

Rule IV (conversion of log base)  \(\log_b u = (\log_b e)(\log_e u)\)  (u > 0)

This rule, which resembles the chain rule in spirit (witness the “chain” b → e, e → u),
enables us to derive a logarithm \(\log_e u\) (to base e) from the logarithm \(\log_b u\) (to base b), or
vice versa.

Proof  Let \(u = e^p\), so that \(p = \log_e u\). Then it follows that

\[ \log_b u = \log_b e^p = p \log_b e = (\log_e u)(\log_b e) \]

Rule IV can readily be generalized to

\[ \log_b u = (\log_b c)(\log_c u) \]

where c is some base other than b.

Rule V (inversion of log base)  \(\log_b e = \dfrac{1}{\log_e b}\)

This rule, which resembles the inverse-function rule of differentiation, enables us to
obtain the log of b to the base e immediately upon being given the log of e to the base b,
and vice versa. (This rule can also be generalized to the form \(\log_b c = 1/\log_c b\).)

Proof  As an application of Rule IV, let u = b; then we have

\[ \log_b b = (\log_b e)(\log_e b) \]

But the left-side expression is \(\log_b b = 1\); therefore \(\log_b e\) and \(\log_e b\) must be reciprocal to
each other, as Rule V asserts.


From the last two rules, it is easy to derive the following pair of conversion formulas 
between common log and natural log: 

$$\log_{10} N = (\log_{10} e)(\log_e N) = 0.4343 \log_e N$$
$$\log_e N = (\log_e 10)(\log_{10} N) = 2.3026 \log_{10} N \qquad (10.14)$$

for $N$ a positive real number. The first equals sign in each formula is easily justified by
Rule IV. In the first formula, the value 0.4343 (the common log of 2.71828) can be found
from a table of common logarithms or an electronic calculator; in the second, the value
2.3026 (the natural log of 10) is merely the reciprocal of 0.4343, so calculated because of
Rule V.


Example 9 $\log_e 100 = 2.3026(\log_{10} 100) = 2.3026(2) = 4.6052$. Conversely, we have
$\log_{10} 100 = 0.4343(\log_e 100) = 0.4343(4.6052) = 2$.
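
The two conversion constants and the figures of Example 9 can be checked numerically. The following is only an illustrative sketch using Python's standard `math` module; the variable names are our own and do not come from the text.

```python
import math

# Conversion constants from Rules IV and V.
log10_of_e = math.log10(math.e)   # common log of e, about 0.4343
ln_of_10 = math.log(10)           # natural log of 10, about 2.3026

# Example 9: convert log base 10 of 100 into a natural log, and back again.
ln_100 = ln_of_10 * math.log10(100)     # (log_e 10)(log_10 100)
log10_100 = log10_of_e * math.log(100)  # (log_10 e)(log_e 100)

print(round(log10_of_e, 4), round(ln_of_10, 4))
print(round(ln_100, 4), round(log10_100, 4))
```

Because the two constants are reciprocals of each other (Rule V), their product is 1 up to floating-point rounding.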


An Application 

The preceding rules of logarithms enable us to solve with ease certain simple exponential
equations (exponential functions set equal to zero). For instance, if we seek to find the value
of $x$ that satisfies the equation

$$ab^x - c = 0 \qquad (a, b, c > 0)$$

we can first try to transform this exponential equation, by the use of logarithms, into a
linear equation and then solve it as such. For this purpose, the $c$ term should first be
transposed to the right side:

$$ab^x = c$$

This is because there is no simple log expression for the additive expression $(ab^x - c)$, but
there do exist convenient log expressions for the multiplicative term $ab^x$ and for $c$
individually. Thus, after the transposition of $c$ and upon taking the log (say, to base 10) of both
sides, we have

$$\log a + x \log b = \log c$$

which is a linear equation in the variable $x$, with the solution

$$x = \frac{\log c - \log a}{\log b}$$
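
The solution formula can be tried out on a concrete case. The sketch below is illustrative only: the function name `solve_exponential` is hypothetical, and the guard against $b = 1$ (for which $\log b = 0$) is our own addition rather than a condition stated in the text.

```python
import math

def solve_exponential(a, b, c):
    """Solve a * b**x = c for x by taking common logs of both sides."""
    if a <= 0 or b <= 0 or c <= 0:
        raise ValueError("a, b and c must all be positive")
    if b == 1:
        # Assumption added here: log b = 0 would make the division undefined.
        raise ValueError("b must differ from 1")
    return (math.log10(c) - math.log10(a)) / math.log10(b)

# Hypothetical numbers: 2 * 3**x = 54, i.e. 3**x = 27.
x = solve_exponential(2.0, 3.0, 54.0)
print(x)
```

Substituting the computed $x$ back into $ab^x$ should reproduce $c$ up to rounding error.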
EXERCISE 10.3

1. What are the values of the following logarithms?

(a) $\log_{10} 10{,}000$    (c) $\log_3 81$

(b) $\log_{10} 0.0001$    (d) $\log_5 3{,}125$

2. Evaluate the following:

(a) $\ln e^7$    (c) $\ln(1/e^3)$    (e) $(e^{\ln 3})!$

(b) $\log_e e^{-4}$    (d) $\log_e(1/e^2)$    (f) $\ln e^x - e^{\ln x}$

3. Evaluate the following by application of the rules of logarithms:

(a) $\log_{10}(100)^{13}$    (c) $\ln(3/B)$    (e) $\ln ABe^{-4}$

(b) $\log_{10}\frac{1}{100}$    (d) $\ln Ae^2$    (f) $(\log_4 e)(\log_e 64)$

4. Which of the following are valid? 

u uv 

(a) In u - 2 = In -7 (c) In u + In v - In w = In — 


w 


(b) 3 + In v s In 


(d) In 3 -fin 5 = In 8 


5. Prove that $\ln(u/v) = \ln u - \ln v$.


10.4 Logarithmic Functions

When a variable is expressed as a function of the logarithm of another variable, the
function is referred to as a logarithmic function. We have already seen two versions of this type
of function in (10.12) and (10.13), namely,

$$t = \log_b y \qquad \text{and} \qquad t = \log_e y \ (= \ln y)$$

which differ from each other only in regard to the base of the logarithm.

Log Functions and Exponential Functions 

As we stated earlier, log functions are inverse functions of certain exponential functions.
An examination of the previous two log functions will confirm that they are indeed the
respective inverse functions of the exponential functions

$$y = b^t \qquad \text{and} \qquad y = e^t$$

because the log functions cited are the results of reversing the roles of the dependent and
independent variables of the corresponding exponential functions. You should realize, of
course, that the symbol $t$ is being used here as a general symbol, and it does not
necessarily stand for time. Even when it does, its appearance as a dependent variable does not mean
that time is determined by some variable $y$; it means only that a given value of $y$ is
associated with a unique point of time.

As inverse functions of strictly increasing (exponential) functions, logarithmic functions
must also be strictly increasing, which is consistent with our earlier statement that the
larger a number, the larger is its logarithm to any given base. This property may be
expressed symbolically in terms of the following two propositions: For two positive values
of $y$ ($y_1$ and $y_2$),

$$\ln y_1 = \ln y_2 \iff y_1 = y_2$$
$$\ln y_1 > \ln y_2 \iff y_1 > y_2 \qquad (10.15)$$

These propositions are also valid, of course, if we replace $\ln$ by $\log_b$.


The Graphical Form 

The monotonicity and other general properties of logarithmic functions can be clearly
observed from their graphs. Given the graph of the exponential function $y = e^t$, we can
obtain the graph of the corresponding log function by replotting the original graph with the
two axes transposed. The result of such replotting is illustrated in Fig. 10.3. Note that if the
graph of Fig. 10.3b were laid over the graph of Fig. 10.3a, with $y$ axis on $y$ axis and $t$ axis
on $t$ axis, the two curves should coincide exactly. As they actually appear in Fig. 10.3 (with
interchanged axes), on the other hand, the two curves are seen to be mirror images of each
other (as the graphs of any pair of inverse functions must be) with reference to the 45° line
drawn through the origin.

FIGURE 10.3

This mirror-image relationship has several noteworthy implications. For one, although
both are strictly increasing, the log curve increases at a decreasing rate (second
derivative negative), in contradistinction to the exponential curve, which increases
at an increasing rate. Another interesting contrast is that, while the exponential function
has a positive range, the log function has a positive domain instead. (This latter restriction
on the domain of the log function is, of course, merely another way of stating that only
positive numbers possess logarithms.) A third consequence of the mirror-image
relationship is that, just as $y = e^t$ has a vertical intercept at 1, the log function $t = \log_e y$ must
cross the horizontal axis at $y = 1$, indicating that $\log_e 1 = 0$. Inasmuch as this horizontal
intercept is unaffected by the base of the logarithm (for instance, $\log_{10} 1 = 0$, too), we
may infer from the general shape of the log curve in Fig. 10.3b that, for any base,

$$\left.\begin{matrix} 0 < y < 1 \\ y = 1 \\ y > 1 \end{matrix}\right\} \iff \left\{\begin{matrix} \log y < 0 \\ \log y = 0 \\ \log y > 0 \end{matrix}\right. \qquad (10.16)$$


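For verification, we can check the two sets of examples of common and natural logarithms
given in Sec. 10.3. Furthermore, we may note that

$$\log y \to \begin{cases} \infty \\ -\infty \end{cases} \quad \text{as} \quad y \to \begin{cases} \infty \\ 0^+ \end{cases} \qquad (10.16')$$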

The graphical comparison of the logarithmic function and the exponential function in
Fig. 10.3 is based on the simple functions $y = e^t$ and $t = \ln y$. The same general result will
prevail if we compare the generalized exponential function $y = Ae^{rt}$ with its
corresponding log function. With the (positive) constants $A$ and $r$ to compress or extend the
exponential curve, it will nevertheless resemble the general shape of Fig. 10.3a, except that its
vertical intercept will be at $y = A$ rather than at $y = 1$ (when $t = 0$, we have $y = Ae^0 = A$).
Its inverse function, accordingly, must have a horizontal intercept at $y = A$. In general,
with reference to the 45° line, the corresponding log curve will be a mirror image of the
exponential curve.

If the specific algebraic expression of the inverse of $y = Ae^{rt}$ is desired, it can be
obtained by taking the natural log of both sides of this exponential function [which,
according to the first proposition in (10.15), will leave the equation undisturbed] and then
solving for $t$:

$$\ln y = \ln(Ae^{rt}) = \ln A + rt \ln e = \ln A + rt$$

Hence,

$$t = \frac{\ln y - \ln A}{r} \qquad (r \neq 0) \qquad (10.17)$$

This result, a log function, constitutes the inverse of the exponential function $y = Ae^{rt}$. As
claimed earlier, the function in (10.17) has a horizontal intercept at $y = A$, because when
$y = A$, we have $\ln y = \ln A$, and therefore $t = 0$.
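
The inverse relationship in (10.17) can be verified by a round trip: start from some $t$, compute $y = Ae^{rt}$, and recover $t$ from $y$. The sketch below is illustrative; the function names and the sample values of $A$ and $r$ are assumptions chosen for the demonstration.

```python
import math

def exponential_value(A, r, t):
    """y = A * e**(r*t)."""
    return A * math.exp(r * t)

def inverse_time(A, r, y):
    """t = (ln y - ln A) / r, the inverse function in (10.17); requires r != 0."""
    if r == 0:
        raise ValueError("r must be nonzero for the inverse to exist")
    return (math.log(y) - math.log(A)) / r

A, r = 3.0, 0.07                    # hypothetical positive constants
y = exponential_value(A, r, 12.0)   # go from t = 12 to y
t_back = inverse_time(A, r, y)      # and come back from y to t
t_at_intercept = inverse_time(A, r, A)
print(t_back, t_at_intercept)
```

The second call illustrates the horizontal intercept: at $y = A$ the recovered $t$ is zero.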


Base Conversion 

In Sec. 10.2, it was stated that the exponential function $y = Ab^t$ can always be converted
into a natural exponential function $y = Ae^{rt}$. We are now ready to derive a conversion
formula. Instead of $Ab^t$, however, let us consider the conversion of the more general
expression $Ab^{ct}$ into $Ae^{rt}$. Since the essence of the problem is to find an $r$ from given values of $b$
and $c$ such that

$$e^r = b^c$$

all that is necessary is to express $r$ as a function of $b$ and $c$. Such a task is easily
accomplished by taking the natural log of both sides of the last equation:

$$\ln e^r = \ln b^c$$

The left side can immediately be read as equal to $r$, so that the desired function (conversion
formula) emerges as

$$r = \ln b^c = c \ln b \qquad (10.18)$$

This indicates that the function $y = Ab^{ct}$ can always be rewritten in the natural-base form,

$$y = Ae^{(c \ln b)t}$$

Example 1 Convert $y = 2^t$ to a natural exponential function. Here, we have $A = 1$, $b = 2$, and $c = 1$.
Hence $r = c \ln b = \ln 2$, and the desired exponential function is

$$y = Ae^{rt} = e^{(\ln 2)t}$$

If we like, we can also calculate the numerical value of $\ln 2$ by use of (10.14) and a table of
common logarithms as follows:

$$\ln 2 = 2.3026 \log_{10} 2 = 2.3026(0.3010) = 0.6931 \qquad (10.19)$$

Then we may express the earlier result alternatively as $y = e^{0.6931t}$.

Example 2 Convert $y = 3(5)^{2t}$ to a natural exponential function. In this example, $A = 3$, $b = 5$, and
$c = 2$, and formula (10.18) gives us $r = 2 \ln 5$. Therefore the desired function is

$$y = Ae^{rt} = 3e^{(2 \ln 5)t}$$

Again, if we like, we can calculate that

$$2 \ln 5 = \ln 25 = 2.3026 \log_{10} 25 = 2.3026(1.3979) = 3.2188$$

so the earlier result can be alternatively expressed as $y = 3e^{3.2188t}$.
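
Formula (10.18) lends itself to a direct numerical check against Examples 1 and 2. The following is a minimal sketch; `natural_rate` is a hypothetical helper name, and the evaluation point $t = 1.5$ is an arbitrary choice.

```python
import math

def natural_rate(b, c=1.0):
    """Return r = c * ln b, so that A * b**(c*t) equals A * e**(r*t)."""
    if b <= 0:
        raise ValueError("the base b must be positive")
    return c * math.log(b)

r_example1 = natural_rate(2.0)        # Example 1: y = 2**t
r_example2 = natural_rate(5.0, 2.0)   # Example 2: y = 3 * 5**(2t)

# Compare the two forms of Example 2 at an arbitrary point in time.
t = 1.5
original_form = 3.0 * 5.0 ** (2.0 * t)
natural_form = 3.0 * math.exp(r_example2 * t)
print(r_example1, r_example2, original_form, natural_form)
```

The computed rates agree with the table-based values in the text to about three or four decimal places; the small differences come from rounding in the log table.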


It is also possible, of course, to convert log functions of the form $t = \log_b y$ into
equivalent natural log functions. To that end, it is sufficient to apply Rule IV of logarithms, which
may be expressed as

$$\log_b y = (\log_b e)(\log_e y)$$

The direct substitution of this result into the given log function immediately gives us the
desired natural log function:

$$t = \log_b y = (\log_b e)(\log_e y)$$
$$= \frac{1}{\log_e b}\log_e y \qquad \text{[by Rule V of logarithms]}$$
$$= \frac{\ln y}{\ln b}$$

By the same procedure, we can transform the more general log function $t = a \log_b(cy)$ into
the equivalent form

$$t = a(\log_b e)(\log_e cy) = \frac{a}{\log_e b}\log_e(cy) = \frac{a}{\ln b}\ln(cy)$$

Example 3 Convert the function $t = \log_2 y$ into the natural log form. Since in this example we have
$b = 2$ and $a = c = 1$, the desired function is

$$t = \frac{1}{\ln 2}\ln y$$

By (10.19), however, we may also express it as $t = (1/0.6931) \ln y$.

Example 4 Convert the function $t = 7 \log_{10}(2y)$ into a natural logarithmic function. The values of the
constants are in this case $a = 7$, $b = 10$, and $c = 2$; consequently, the desired function is

$$t = \frac{7}{\ln 10}\ln(2y)$$

But since $\ln 10 = 2.3026$, as (10.14) indicates, this function can be rewritten as
$t = (7/2.3026) \ln(2y) = 3.0400 \ln(2y)$.
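
The equivalence of $a \log_b(cy)$ and $(a/\ln b)\ln(cy)$ can be confirmed numerically with the constants of Example 4. This is an illustrative sketch only; the function names and the sample value $y = 37.5$ are our own.

```python
import math

def log_in_base_b(a, b, c, y):
    """t = a * log_b(c*y), evaluated directly in base b."""
    return a * math.log(c * y, b)

def log_in_natural_form(a, b, c, y):
    """t = (a / ln b) * ln(c*y), the equivalent natural-log form."""
    return (a / math.log(b)) * math.log(c * y)

# Example 4: t = 7 log_10(2y), evaluated at an arbitrary positive y.
y = 37.5
t_direct = log_in_base_b(7.0, 10.0, 2.0, y)
t_natural = log_in_natural_form(7.0, 10.0, 2.0, y)
coefficient = 7.0 / math.log(10.0)   # about 3.0400
print(t_direct, t_natural, coefficient)
```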


In the preceding discussion, we have followed the practice of expressing $t$ as a function
of $y$ when the function is logarithmic. The only reason for doing so is our desire to stress
the inverse-function relationship between the exponential and logarithmic functions. When
a log function is studied by itself, we shall write $y = \ln t$ (rather than $t = \ln y$), as is
customary. Naturally, nothing in the analytical aspect of the discussion will be affected
by such an interchange of symbols.


EXERCISE 10.4 

1. The form of the inverse function of $y = Ae^{rt}$ in (10.17) requires $r$ to be nonzero. What
is the meaning of this requirement when viewed in reference to the original
exponential function $y = Ae^{rt}$?

2. (a) Sketch a graph of the exponential function $y = Ae^{rt}$; indicate the value of the
vertical intercept.

(b) Then sketch the graph of the log function $t = \dfrac{\ln y - \ln A}{r}$, and indicate the value of
the horizontal intercept.

3. Find the inverse function of $y = ab^{ct}$.

4. Transform the following functions to their natural exponential forms:

(a) / = (c) y = 5(5) f 

(b) y = 2(7) 2t (d) y = 2(15)^ 

5. Transform the following functions to their natural logarithmic forms:

(a) $t = \log_7 y$    (c) $t = 3 \log_{15}(9y)$

(b) $t = \log_8(3y)$    (d) $t = 2 \log_{10} y$

6. Find the continuous-compounding nominal interest rate per annum ($r$) that is
equivalent to a discrete-compounding interest rate ($i$) of

(a) 5 percent per annum, compounded annually. 

(b) 5 percent per annum, compounded semiannually. 

(c) 6 percent per annum, compounded semiannually. 

(d) 6 percent per annum, compounded quarterly. 
7. (a) In describing Fig. 10.3, the text states that, if the two curves are laid over each
other, they show a mirror-image relationship. Where is the "mirror" located?

(b) If we plot a function $f(x)$ and its negative, $-f(x)$, in the same diagram, will the two
curves display a mirror-image relationship, too? If so, where is the "mirror" located
in this case?

(c) If we plot the graphs of $Ae^{rt}$ and $Ae^{-rt}$ in the same diagram, will the two curves be
mirror images of each other? If so, where is the "mirror" located?


10.5 Derivatives of Exponential and Logarithmic Functions 

Earlier it was claimed that the function $e^t$ is its own derivative. As it turns out, the natural
log function, $\ln t$, possesses a rather convenient derivative also, namely, $d(\ln t)/dt = 1/t$.
This fact reinforces our preference for the base $e$. Let us now prove the validity of these two
derivative formulas, and then we shall deduce the derivative formulas for certain variants
of the exponential and log expressions $e^t$ and $\ln t$.


Log-Function Rule 

The derivative of the log function $y = \ln t$ is

$$\frac{d}{dt}\ln t = \frac{1}{t}$$

To prove this, we recall that, by definition, the derivative of $y = f(t) = \ln t$ has the
following value at $t = N$ (assuming $t \to N^+$):

$$f'(N) = \lim_{t \to N^+}\frac{f(t) - f(N)}{t - N} = \lim_{t \to N^+}\frac{\ln t - \ln N}{t - N}$$
$$= \lim_{t \to N^+}\frac{\ln(t/N)}{t - N} \qquad \text{[by Rule II of logarithms]}$$

Now let us introduce a shorthand symbol $m \equiv \dfrac{N}{t - N}$. Then we can write $\dfrac{1}{t - N} = \dfrac{m}{N}$ and
also $\dfrac{t}{N} = 1 + \dfrac{t - N}{N} = 1 + \dfrac{1}{m}$. Thus the expression to the right of the limit sign in the
previous equation can be converted to the form

$$\frac{1}{t - N}\ln\frac{t}{N} = \frac{m}{N}\ln\left(1 + \frac{1}{m}\right) = \frac{1}{N}\ln\left(1 + \frac{1}{m}\right)^m \qquad \text{[by Rule III of logarithms]}$$

Note that, as $t \to N^+$, $m$ tends to infinity. Thus, to find the desired derivative value, we may
take the limit of the last expression in the preceding equation as $m \to \infty$:

$$f'(N) = \lim_{m \to \infty}\frac{1}{N}\ln\left(1 + \frac{1}{m}\right)^m = \frac{1}{N}\ln e = \frac{1}{N} \qquad \text{[by (10.5)]}$$

Since $N$ can be any number for which a logarithm is defined, however, we can generalize
this result, and write $f'(t) = d(\ln t)/dt = 1/t$. This proves the log-function rule for
$t \to N^+$.

The case of $t \to N^-$ needs some modifications, but the essence of the proof is similar.
Now the derivative of $y = \ln t$ has the value

$$f'(N) = \lim_{t \to N^-}\frac{f(t) - f(N)}{t - N} = \lim_{t \to N^-}\frac{f(N) - f(t)}{N - t}$$
$$= \lim_{t \to N^-}\frac{\ln N - \ln t}{N - t} = \lim_{t \to N^-}\frac{\ln(N/t)}{N - t}$$

Let $\mu \equiv t/(N - t)$; then $1/(N - t) = \mu/t$, and $N/t = 1 + (N - t)/t = 1 + 1/\mu$. These
equations enable us to rewrite the expression to the right of the last limit sign in the
preceding equation for $f'(N)$ as

$$\frac{1}{N - t}\ln\frac{N}{t} = \frac{\mu}{t}\ln\left(1 + \frac{1}{\mu}\right) = \frac{1}{t}\ln\left(1 + \frac{1}{\mu}\right)^{\mu}$$

As $t \to N^-$, $\mu \to \infty$. Thus the desired derivative value is

$$f'(N) = \lim_{t \to N^-}\frac{1}{t}\ln e = \frac{1}{N}$$

the same result as for the case of $t \to N^+$. This completes the proof of the log-function
rule. Notice, once more, that in the proof process, no specific numerical values are
employed, and the result is therefore generally applicable.
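
The two ingredients of the proof, the one-sided difference quotients of $\ln t$ and the limit of $(1 + 1/m)^m$, can be illustrated numerically. The sketch below is not a proof, only a sanity check; the point $N = 4$, the step size, and the value of $m$ are arbitrary choices of ours.

```python
import math

def difference_quotient(N, h):
    """[ln(N + h) - ln N] / h; h > 0 approaches N from the right, h < 0 from the left."""
    return (math.log(N + h) - math.log(N)) / h

N = 4.0
right_slope = difference_quotient(N, 1e-6)    # t -> N from above
left_slope = difference_quotient(N, -1e-6)    # t -> N from below

# The expression (1 + 1/m)**m approaches e as m grows large.
m = 1_000_000
limit_term = (1 + 1 / m) ** m

print(right_slope, left_slope, 1 / N)
print(limit_term, math.e)
```

Both one-sided quotients should lie very close to $1/N = 0.25$, in line with the log-function rule.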


Exponential-Function Rule 

The derivative of the function $y = e^t$ is

$$\frac{d}{dt}e^t = e^t$$

This result follows easily from the log-function rule. We know that the inverse function of
the function $y = e^t$ is $t = \ln y$, with derivative $dt/dy = 1/y$. Thus, by the inverse-function
rule, we may write immediately

$$\frac{d}{dt}e^t = \frac{dy}{dt} = \frac{1}{dt/dy} = \frac{1}{1/y} = y = e^t$$


The Rules Generalized 


The log-function and exponential-function rules can be generalized to cases where
the variable $t$ in the expressions $e^t$ and $\ln t$ is replaced by some function of $t$, say, $f(t)$. The
generalized versions of the two rules are

$$\frac{d}{dt}e^{f(t)} = f'(t)e^{f(t)} \qquad \text{or} \qquad \frac{d}{dt}e^u = e^u\frac{du}{dt}$$
$$\frac{d}{dt}\ln f(t) = \frac{f'(t)}{f(t)} \qquad \text{or} \qquad \frac{d}{dt}\ln v = \frac{1}{v}\frac{dv}{dt} \qquad (10.20)$$

The proofs for (10.20) involve nothing more than the straightforward application of the
chain rule. Given a function $y = e^{f(t)}$, we can first let $u = f(t)$, so that $y = e^u$. Then, by
the chain rule, the derivative emerges as

$$\frac{d}{dt}e^{f(t)} = \frac{d}{dt}e^u = \frac{de^u}{du}\frac{du}{dt} = e^u\frac{du}{dt} = e^{f(t)}f'(t)$$

Similarly, given a function $y = \ln f(t)$, we can first let $v = f(t)$, so as to form a chain:
$y = \ln v$, where $v = f(t)$. Then, by the chain rule, we have

$$\frac{d}{dt}\ln f(t) = \frac{d}{dt}\ln v = \frac{d\ln v}{dv}\frac{dv}{dt} = \frac{1}{v}\frac{dv}{dt} = \frac{1}{f(t)}f'(t)$$

Note that the only real modification introduced in (10.20) beyond the simpler rules
$de^t/dt = e^t$ and $d(\ln t)/dt = 1/t$ is the multiplicative factor $f'(t)$.

Example 1 Find the derivative of the function $y = e^{rt}$. Here, the exponent is $rt = f(t)$, with $f'(t) = r$;
thus

$$\frac{dy}{dt} = \frac{d}{dt}e^{rt} = re^{rt}$$

Example 2 Find $dy/dt$ from the function $y = e^{-t}$. In this case, $f(t) = -t$, so that $f'(t) = -1$. As a result,

$$\frac{dy}{dt} = \frac{d}{dt}e^{-t} = -e^{-t}$$

Example 3 Find $dy/dt$ from the function $y = \ln(at)$. Since in this case $f(t) = at$, with $f'(t) = a$, the
derivative is

$$\frac{d}{dt}\ln(at) = \frac{a}{at} = \frac{1}{t}$$

which is, interestingly enough, identical with the derivative of $y = \ln t$.

This example illustrates the fact that a multiplicative constant for $t$ within a log
expression drops out in the process of derivation. But note that, for a constant $k$, we have

$$\frac{d}{dt}k\ln t = k\frac{d}{dt}\ln t = \frac{k}{t}$$

thus a multiplicative constant outside the log expression is still retained in derivation.

Example 4 Find the derivative of the function $y = \ln t^c$. With $f(t) = t^c$ and $f'(t) = ct^{c-1}$, the formula in
(10.20) yields

$$\frac{d}{dt}\ln t^c = \frac{ct^{c-1}}{t^c} = \frac{c}{t}$$

Example 5 Find $dy/dt$ from $y = t^3 \ln t^2$. Because this function is a product of two terms $t^3$ and $\ln t^2$, the
product rule should be used:

$$\frac{dy}{dt} = t^3\frac{d}{dt}\ln t^2 + \ln t^2\frac{d}{dt}t^3 = t^3\left(\frac{2t}{t^2}\right) + (\ln t^2)(3t^2)$$
$$= 2t^2 + 3t^2(2\ln t) \qquad \text{[Rule III of logarithms]}$$
$$= 2t^2(1 + 3\ln t)$$
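
A finite-difference approximation offers a quick check on derivatives obtained from (10.20). The sketch below compares the analytical results of Examples 3 and 5 with a central difference; the helper `central_difference`, the step size, and the evaluation points are assumptions made for illustration.

```python
import math

def central_difference(f, t, h=1e-6):
    """Approximate f'(t) by the symmetric quotient [f(t+h) - f(t-h)] / (2h)."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Example 5: y = t**3 * ln(t**2), with dy/dt = 2 t**2 (1 + 3 ln t).
def y(t):
    return t ** 3 * math.log(t ** 2)

def dy_dt(t):
    return 2 * t ** 2 * (1 + 3 * math.log(t))

t0 = 1.7
numeric_ex5 = central_difference(y, t0)
analytic_ex5 = dy_dt(t0)

# Example 3: the constant a inside ln(a*t) drops out, leaving 1/t.
numeric_ex3 = central_difference(lambda t: math.log(5.0 * t), 2.0)
print(numeric_ex5, analytic_ex5, numeric_ex3)
```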
The Case of Base b

For exponential and log functions with base $b$, the derivatives are

$$\frac{d}{dt}b^t = b^t\ln b \qquad \left[\text{Warning: } \frac{d}{dt}b^t \neq tb^{t-1}\right]$$
$$\frac{d}{dt}\log_b t = \frac{1}{t\ln b} \qquad (10.21)$$

Note that in the special case of base $e$ (when $b = e$), we have $\ln b = \ln e = 1$, so that these
two derivatives reduce to the basic exponential-function rule $(d/dt)e^t = e^t$ and the basic
log-function rule $(d/dt)\ln t = 1/t$, respectively.

The proofs for (10.21) are not difficult. For the case of $b^t$, the proof is based on the
identity $b \equiv e^{\ln b}$, which enables us to write

$$b^t = e^{(\ln b)t} = e^{t\ln b}$$

(We write $t \ln b$, instead of $\ln b\,t$, in order to emphasize that $t$ is not a part of the log
expression.) Hence

$$\frac{d}{dt}b^t = \frac{d}{dt}e^{t\ln b} = (\ln b)(e^{t\ln b}) \qquad \text{[by (10.20)]}$$
$$= (\ln b)(b^t) = b^t\ln b$$

To prove the second part of (10.21), on the other hand, we rely on the basic log property that

$$\log_b t = (\log_b e)(\log_e t) = \frac{1}{\ln b}\ln t$$

which leads us to the derivative

$$\frac{d}{dt}\log_b t = \frac{d}{dt}\left(\frac{1}{\ln b}\ln t\right) = \frac{1}{\ln b}\frac{d}{dt}\ln t = \frac{1}{\ln b}\left(\frac{1}{t}\right)$$

The more general versions of these two formulas are

$$\frac{d}{dt}b^{f(t)} = f'(t)b^{f(t)}\ln b$$
$$\frac{d}{dt}\log_b f(t) = \frac{f'(t)}{f(t)}\frac{1}{\ln b} \qquad (10.21')$$

Again, it is seen that if $b = e$, then $\ln b = 1$, and these formulas reduce to (10.20).

Example 6 Find the derivative of the function $y = 12^{1-t}$. Here, $b = 12$, $f(t) = 1 - t$, and $f'(t) = -1$;
thus

$$\frac{dy}{dt} = -(12)^{1-t}\ln 12$$
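
The base-$b$ formulas in (10.21) and the result of Example 6 can be checked against a numerical derivative. This is an illustrative sketch under our own choice of base ($b = 12$), evaluation point, and helper name.

```python
import math

def central_difference(f, t, h=1e-6):
    """Approximate f'(t) by the symmetric quotient [f(t+h) - f(t-h)] / (2h)."""
    return (f(t + h) - f(t - h)) / (2 * h)

b, t0 = 12.0, 0.5

# d/dt b**t = b**t * ln b
numeric_exp = central_difference(lambda t: b ** t, t0)
analytic_exp = b ** t0 * math.log(b)

# d/dt log_b t = 1 / (t * ln b)
numeric_log = central_difference(lambda t: math.log(t, b), t0)
analytic_log = 1.0 / (t0 * math.log(b))

# Example 6: y = 12**(1 - t), dy/dt = -(12)**(1 - t) * ln 12
numeric_ex6 = central_difference(lambda t: 12.0 ** (1.0 - t), t0)
analytic_ex6 = -(12.0 ** (1.0 - t0)) * math.log(12.0)

print(numeric_exp, analytic_exp)
print(numeric_log, analytic_log)
print(numeric_ex6, analytic_ex6)
```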


Higher Derivatives

Higher derivatives of exponential and log functions, like those of other types of functions,
are merely the results of repeated differentiation.

Example 7 Find the second derivative of $y = b^t$ (with $b > 1$). The first derivative, by (10.21), is
$y'(t) = b^t \ln b$ (where $\ln b$ is, of course, a constant); thus, by differentiating once more with
respect to $t$, we have

$$y''(t) = \frac{d}{dt}y'(t) = \left(\frac{d}{dt}b^t\right)\ln b = (b^t\ln b)\ln b = b^t(\ln b)^2$$

Note that $y = b^t$ is always positive and $\ln b$ (for $b > 1$) is also positive [by (10.16)]; thus
$y'(t) = b^t \ln b$ must be positive. And $y''(t)$, being a product of $b^t$ and a squared number, is
also positive. These facts confirm our previous assertion that the exponential function $y = b^t$
increases monotonically at an increasing rate.

Example 8 Find the second derivative of $y = \ln t$. The first derivative is $y' = 1/t = t^{-1}$; hence, the
second derivative is

$$y'' = -t^{-2} = \frac{-1}{t^2}$$

Inasmuch as the domain of this function consists of the open interval $(0, \infty)$, $y' = 1/t$ must
be a positive number. On the other hand, $y''$ is always negative. Together, these conclusions
serve to confirm our earlier assertion that the log function $y = \ln t$ increases monotonically
at a decreasing rate.

An Application

One of the prime virtues of the logarithm is its ability to convert a multiplication into an
addition, and a division into a subtraction. This property can be exploited when we are
differentiating a complicated product or quotient of any type of functions (not necessarily
exponential or logarithmic).

Example 9 Find $dy/dx$ from

$$y = \frac{x^2}{(x + 3)(2x + 1)}$$

Instead of applying the product and quotient rules, we may first take the natural log of both
sides of the equation to reduce the function to the form

$$\ln y = \ln x^2 - \ln(x + 3) - \ln(2x + 1)$$

According to (10.20), the derivative of the left side with respect to $x$ is

$$\frac{d}{dx}(\text{left side}) = \frac{1}{y}\frac{dy}{dx}$$

whereas the right side gives

$$\frac{d}{dx}(\text{right side}) = \frac{2x}{x^2} - \frac{1}{x + 3} - \frac{2}{2x + 1} = \frac{7x + 6}{x(x + 3)(2x + 1)}$$

When the two results are equated and both sides are multiplied by $y$, we get the desired
derivative as follows:

$$\frac{dy}{dx} = \frac{7x + 6}{x(x + 3)(2x + 1)}\,y$$
$$= \frac{7x + 6}{x(x + 3)(2x + 1)} \cdot \frac{x^2}{(x + 3)(2x + 1)} = \frac{x(7x + 6)}{(x + 3)^2(2x + 1)^2}$$
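
The logarithmic-differentiation result of Example 9 can be cross-checked in two ways: against the closed-form expression and against a numerical derivative. The sketch below is illustrative; the function names and the evaluation point $x = 2$ are our own choices.

```python
import math

def y(x):
    """The function of Example 9."""
    return x ** 2 / ((x + 3) * (2 * x + 1))

def derivative_via_logs(x):
    """dy/dx = y * d(ln y)/dx, with ln y = 2 ln x - ln(x + 3) - ln(2x + 1)."""
    dlny_dx = 2 / x - 1 / (x + 3) - 2 / (2 * x + 1)
    return y(x) * dlny_dx

def derivative_closed_form(x):
    """The final expression x(7x + 6) / [(x + 3)^2 (2x + 1)^2]."""
    return x * (7 * x + 6) / ((x + 3) ** 2 * (2 * x + 1) ** 2)

x0, h = 2.0, 1e-6
numeric = (y(x0 + h) - y(x0 - h)) / (2 * h)   # central-difference approximation
print(derivative_via_logs(x0), derivative_closed_form(x0), numeric)
```

At $x = 2$ the function value is $4/25 = 0.16$ and the derivative is $40/625 = 0.064$, which both routes reproduce.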
Example 10 Find $dy/dx$ from $y = x^a e^{kx - c}$. Taking the natural log of both sides, we have

$$\ln y = a\ln x + \ln e^{kx - c} = a\ln x + kx - c$$

Differentiating both sides with respect to $x$, and using (10.20), we then get

$$\frac{1}{y}\frac{dy}{dx} = \frac{a}{x} + k$$

and

$$\frac{dy}{dx} = \left(\frac{a}{x} + k\right)y = \left(\frac{a}{x} + k\right)x^a e^{kx - c}$$

Note, however, that if the given function contains additive terms, then it may not be
desirable to convert the function into the log form.


EXERCISE 10.5 


1. Find the derivatives of: 

(a) y = e 2t+4 (d) y = 5P 3- ' 2 (g) y = x 2 e 2> 

(b) y = e 1 " 9t (e) y = e °* 2+b *~ c (h) y = axe bx+c 

(c) y^/' 1 (0 y - xe* 

2. (a) Verify the derivative in Example 3 by utilizing the equation $\ln(at) = \ln a + \ln t$.
(b) Verify the result in Example 4 by utilizing the equation $\ln t^c = c \ln t$.

3. Find the derivatives of: ( 2x \ 

(a) y = ln(7t s ) (d) y = 5 ln(f + l) 2 (9)^ = ln (^J 

(b) y= In (flf f ) (?) y = In x - In (1 + x) (h) y = 5x 4 In x 2 

(c) y = ln(f +19) (f) y = ln[x(1 -x) 8 ] 


4. Find the derivatives of: 

(o) y = 5' (c) y = 13 2M (?) y = log 2 (8x 2 + 3) 

(b) y - log 2 (t+ 1 ) (d) y - log 7 7x 2 {f)y=x 2 log 3 x 

5. Prove the two formulas in (10.21'). 

6. Show that the function $V = Ae^{rt}$ (with $A, r > 0$) and the function $A = Ve^{-rt}$ (with
$V, r > 0$) are both strictly monotonic, but in opposite directions, and that they are both
strictly convex in shape (cf. Exercise 10.2-5).

7. Find the derivatives of the following by first taking the natural log of both sides: 


10.6 Optimal Timing

What we have learned about exponential and log functions can now be applied to some 
simple problems of optimal timing. 

A Problem of Wine Storage 

Suppose that a certain wine dealer is in possession of a particular quantity (say, a case) of
wine, which he can either sell at the present time ($t = 0$) for a sum of \$K or else store for
some length of time and then sell at a higher value. The growing value ($V$) of the wine is
known to be the following function of time:

$$V = Ke^{\sqrt{t}} \qquad [= K\exp(t^{1/2})] \qquad (10.22)$$

so that if $t = 0$ (sell now), then $V = K$. The problem is to ascertain when he should sell it
in order to maximize profit, assuming the storage cost to be nil.†

Since the cost of wine is a "sunk" cost (the wine is already paid for by the dealer), and
since storage cost is assumed to be nonexistent, to maximize profit is the same as
maximizing the sales revenue, or the value of $V$. There is one catch, however. Each value of $V$
corresponding to a specific point of $t$ represents a dollar sum receivable at a different date
and, because of the interest element involved, is not directly comparable with the $V$ value
of another date. The way out of this difficulty is to discount each $V$ figure to its
present-value equivalent (the value at time $t = 0$), for then all the $V$ values will be on a comparable
footing.

Let us assume that the interest rate on the continuous-compounding basis is at the level
of $r$. Then, according to (10.11), the present value of $V$ can be expressed as

$$A(t) = Ve^{-rt} = Ke^{\sqrt{t}}e^{-rt} = Ke^{\sqrt{t} - rt} \qquad (10.22')$$

where $A$, denoting the present value of $V$, is itself a function of $t$. Therefore our problem
amounts to finding the value of $t$ that maximizes $A$.


Maximization Conditions 

The first-order condition for maximizing $A$ is to have $dA/dt = 0$. To find this derivative,
we can either differentiate (10.22') directly with respect to $t$, or do it indirectly by first
taking the natural log of both sides of (10.22') and then differentiating with respect to $t$. Let
us illustrate the latter procedure.

First, we obtain from (10.22') the equation

$$\ln A(t) = \ln K + \ln e^{\sqrt{t} - rt} = \ln K + (t^{1/2} - rt)$$

Upon differentiating both sides with respect to $t$, we then get

$$\frac{1}{A}\frac{dA}{dt} = \frac{1}{2}t^{-1/2} - r \qquad \text{or} \qquad \frac{dA}{dt} = A\left(\frac{1}{2}t^{-1/2} - r\right)$$

Since $A \neq 0$, the condition $dA/dt = 0$ can be satisfied if and only if

$$\frac{1}{2}t^{-1/2} = r \qquad \text{or} \qquad \frac{1}{2\sqrt{t}} = r \qquad \text{or} \qquad \frac{1}{2r} = \sqrt{t}$$

This implies that the optimum length of storage time is

$$t^* = \left(\frac{1}{2r}\right)^2 = \frac{1}{4r^2}$$

† The consideration of storage cost will entail a difficulty we are not yet equipped to handle. Later, in
Chap. 14, we shall return to this problem.
If $r = 0.10$, for instance, then $t^* = 25$, and the dealer should store the case of wine for
25 years. Note that the higher the rate of interest (rate of discount) is, the shorter the
optimum storage period will be.
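
The optimum $t^* = 1/(4r^2)$ can be illustrated by evaluating the present value on either side of it. The sketch below is a minimal check with hypothetical function names; $K$ is set to 1 for convenience, since it only scales $A(t)$ and does not affect the timing.

```python
import math

def present_value(t, r, K=1.0):
    """A(t) = K * exp(sqrt(t) - r*t), the present value of the wine."""
    return K * math.exp(math.sqrt(t) - r * t)

def optimal_storage_time(r):
    """t* = 1 / (4 r**2), from the first-order condition 1/(2 sqrt(t)) = r."""
    if r <= 0:
        raise ValueError("the interest rate r must be positive")
    return 1.0 / (4.0 * r ** 2)

r = 0.10
t_star = optimal_storage_time(r)
pv_at_optimum = present_value(t_star, r)
pv_one_year_early = present_value(t_star - 1.0, r)
pv_one_year_late = present_value(t_star + 1.0, r)
print(t_star, pv_at_optimum, pv_one_year_early, pv_one_year_late)
```

Selling a year earlier or a year later than $t^*$ yields a slightly lower present value, as the maximization requires.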

The first-order condition, $1/(2\sqrt{t}) = r$, admits of an easy economic interpretation. The
left-hand expression merely represents the rate of growth of wine value $V$, because from
(10.22)

$$\frac{dV}{dt} = \frac{d}{dt}K\exp(t^{1/2}) = K\frac{d}{dt}\exp(t^{1/2}) \qquad [K \text{ constant}]$$
$$= K\left(\frac{1}{2}t^{-1/2}\right)\exp(t^{1/2}) \qquad \text{[by (10.20)]}$$
$$= \frac{1}{2}t^{-1/2}\,V \qquad \text{[by (10.22)]}$$

so that the rate of growth of $V$ is indeed the left-hand expression in the first-order condition:

$$r_V \equiv \frac{dV/dt}{V} = \frac{1}{2}t^{-1/2} = \frac{1}{2\sqrt{t}}$$

The right-hand expression $r$ is, in contrast, the rate of interest or the rate of
compound-interest growth of the cash fund receivable if the wine is sold right away, an
opportunity-cost aspect of storing the wine. Thus, the equating of the two instantaneous rates, as
illustrated in Fig. 10.4, is an attempt to hold onto the wine until the advantage of storage
is completely wiped out, i.e., to wait till the moment when the (declining) rate of growth of
wine value is just matched by the (constant) interest rate on cash sales receipts.

The next order of business is to check whether the value of $t^*$ satisfies the second-order
condition for maximization of $A$. The second derivative of $A$ is

$$\frac{d^2A}{dt^2} = \frac{d}{dt}\left[A\left(\frac{1}{2}t^{-1/2} - r\right)\right] = A\frac{d}{dt}\left(\frac{1}{2}t^{-1/2} - r\right) + \left(\frac{1}{2}t^{-1/2} - r\right)\frac{dA}{dt}$$

But, since the final term drops out when we evaluate it at the equilibrium (optimum) point,
where $dA/dt = 0$, we are left with

$$\frac{d^2A}{dt^2} = A\frac{d}{dt}\left(\frac{1}{2}t^{-1/2} - r\right) = A\left(-\frac{1}{4}t^{-3/2}\right) = \frac{-A}{4\sqrt{t^3}}$$

In view that $A > 0$, this second derivative is negative when evaluated at $t^* > 0$, thereby
ensuring that the solution value $t^*$ is indeed profit-maximizing.

FIGURE 10.4


A Problem of Timber Cutting 

A similar problem, which involves a choice of the best time to take action, is that of timber 
cutting. 

Suppose the value of timber (already planted on some given land) is the following
increasing function of time:

$$V = 2^{\sqrt{t}}$$

expressed in units of \$1,000. Assuming a discount rate of $r$ (on the continuous basis) and
also assuming zero upkeep cost during the period of timber growth, what is the optimal
time to cut the timber for sale?

As in the wine problem, we should first convert $V$ into its present value:

$$A(t) = Ve^{-rt} = 2^{\sqrt{t}}e^{-rt}$$

thus

$$\ln A = \ln 2^{\sqrt{t}} + \ln e^{-rt} = \sqrt{t}\ln 2 - rt = t^{1/2}\ln 2 - rt$$

To maximize $A$, we must set $dA/dt = 0$. The first derivative is obtainable by differentiating
$\ln A$ with respect to $t$ and then multiplying by $A$:

$$\frac{1}{A}\frac{dA}{dt} = \frac{1}{2}t^{-1/2}\ln 2 - r$$

thus

$$\frac{dA}{dt} = A\left(\frac{\ln 2}{2\sqrt{t}} - r\right)$$

Since $A \neq 0$, the condition $dA/dt = 0$ can be met if and only if

$$\frac{\ln 2}{2\sqrt{t}} = r \qquad \text{or} \qquad \sqrt{t} = \frac{\ln 2}{2r}$$

Consequently, the optimum number of years of growth is

$$t^* = \left(\frac{\ln 2}{2r}\right)^2$$

It is evident from this solution that, the higher the rate of discount, the earlier the timber
should be cut.

To make sure that t* is a maximizing (instead of minimizing) solution, the second-order 
condition should be checked. But this will be left to you as an exercise. 

In this example, we have abstracted from planting cost by assuming that the trees are
already planted, in which case the (sunk) planting cost is legitimately excludable from
consideration in the optimization decision. If the decision is not one of when to harvest but one
of whether or not to plant at all, then the planting cost (incurred at the present) must be duly
compared with the present value of the timber output, computed with $t$ set at the optimum
value $t^*$. For instance, if $r = 0.05$, then we have

$$t^* = \left(\frac{0.6931}{0.10}\right)^2 = (6.931)^2 = 48.0 \text{ years}$$

and

$$A^* = 2^{6.931}e^{-0.05(48.0)} = (122.0222)e^{-2.40} = 122.0222(0.0907) = \$11.0674 \text{ (in thousands)}$$

So only a planting cost lower than $A^*$ will make the venture worthwhile, again provided
that upkeep cost is nil.
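
The timber figures can be recomputed without the rounding that a log table imposes. The sketch below is illustrative only, with hypothetical function names; because it carries full floating-point precision throughout, its present value differs slightly (by roughly 0.02) from the text's \$11.0674 thousand, which rests on rounded intermediate values.

```python
import math

def timber_cutting_time(r):
    """t* = (ln 2 / (2r))**2, from the first-order condition."""
    if r <= 0:
        raise ValueError("the discount rate r must be positive")
    return (math.log(2) / (2 * r)) ** 2

def timber_present_value(t, r):
    """A(t) = 2**sqrt(t) * exp(-r*t), in thousands of dollars."""
    return 2 ** math.sqrt(t) * math.exp(-r * t)

r = 0.05
t_star = timber_cutting_time(r)          # roughly 48 years
A_star = timber_present_value(t_star, r) # roughly 11 (thousand dollars)
print(t_star, A_star)
```

Evaluating `timber_present_value` a year before or after `t_star` gives a smaller number, consistent with a maximum.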


EXERCISE 10.6 

1. If the value of wine grows according to the function $V = Ke^{2\sqrt{t}}$, instead of as in
(10.22), how long should the dealer store the wine?

2. Check the second-order condition for the timber-cutting problem.

3. As a generalization of the optimization problem illustrated in the present section, show
that:

(a) With any value function $V = f(t)$ and a given continuous rate of discount $r$, the
first-order condition for the present value $A$ of $V$ to reach a maximum is that the
rate of growth of $V$ be equal to $r$.

(b) The second-order sufficient condition for a maximum really amounts to the
stipulation that the rate of growth of $V$ be strictly decreasing with time.

4. Analyze the comparative statics of the wine-storage problem. 


10.7 Further Applications of Exponential and Logarithmic Derivatives

Aside from their use in optimization problems, the derivative formulas of Sec. 10.5 have
further useful economic applications.

Finding the Rate of Growth 

When a variable $y$ is a function of time, $y = f(t)$, its instantaneous rate of growth is
defined as*

$$r_y \equiv \frac{dy/dt}{y} = \frac{f'(t)}{f(t)} = \frac{\text{marginal function}}{\text{total function}} \tag{10.23}$$

But, from (10.20), we see that this ratio is precisely the derivative of $\ln f(t) = \ln y$. Thus,
to find the instantaneous rate of growth of a function of time $f(t)$, we can, instead of
differentiating it with respect to $t$ and then dividing by $f(t)$, simply take its natural log
and then differentiate $\ln f(t)$ with respect to time.† This alternative method may turn out to
be the simpler approach if $f(t)$ is a multiplicative or divisional expression which, upon
logarithm-taking, will reduce to a sum or difference of additive terms.

* If the variable $t$ does not denote time, the expression $(dy/dt)/y$ is referred to as the
proportional rate of change of $y$ with respect to $t$.

Example 1

Find the rate of growth of $V = Ae^{rt}$, where $t$ denotes time. It is already known to us that the
rate of growth of $V$ is $r$, but let us check it by finding the derivative of $\ln V$:

$$\ln V = \ln A + rt \ln e = \ln A + rt \qquad [A \text{ constant}]$$

Therefore,

$$r_V = \frac{d}{dt}\ln V = 0 + \frac{d}{dt}(rt) = r$$

as was to be demonstrated.

Example 2

Find the rate of growth of $y = 4^t$. In this case, we have

$$\ln y = \ln 4^t = t \ln 4$$

Hence

$$r_y = \frac{d}{dt}\ln y = \ln 4$$

This is as it should be, because $e^{\ln 4} = 4$, and consequently, $y = 4^t$ can be rewritten as
$y = e^{(\ln 4)t}$, which would immediately enable us to read $(\ln 4)$ as the rate of growth of $y$.
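The two examples can also be checked numerically. The following is only a sketch: the helper name `growth_rate`, the step size `h`, and the values chosen for A, r, and t are illustrative assumptions, not part of the text. It approximates the derivative of ln f(t) by a central difference and compares the result with r and with ln 4.

```python
import math

def growth_rate(f, t, h=1e-6):
    """Approximate d/dt ln f(t) by a central difference (hypothetical helper)."""
    return (math.log(f(t + h)) - math.log(f(t - h))) / (2 * h)

# Assumed illustrative values; any A > 0 and any r would do.
A, r = 100.0, 0.05

def V(t):
    return A * math.exp(r * t)   # Example 1: V = A e^(rt)

def y(t):
    return 4.0 ** t              # Example 2: y = 4^t

rate_V = growth_rate(V, 3.0)     # should be close to r
rate_y = growth_rate(y, 3.0)     # should be close to ln 4

print(rate_V, r)
print(rate_y, math.log(4))
```

Because ln V and ln y are both linear in t here, the approximation should agree with the exact rates up to rounding error at any point t.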

Rate of Growth of a Combination of Functions 

To carry this discussion a step further, let us examine the instantaneous rate of growth of a 
product of two functions of time: 


$$y = uv \qquad \text{where } u = f(t), \quad v = g(t)$$

Taking the natural log of $y$, we obtain

$$\ln y = \ln u + \ln v$$

Thus the desired rate of growth is

$$r_y = \frac{d}{dt}\ln y = \frac{d}{dt}\ln u + \frac{d}{dt}\ln v$$

But the two terms on the right side are the rates of growth of $u$ and $v$, respectively. Thus we
have the rule

$$r_{(uv)} = r_u + r_v \tag{10.24}$$


Expressed in words, the instantaneous rate of growth of a product is the sum of the
instantaneous rates of growth of the components.

By a similar procedure, the rate of growth of a quotient can be shown to be the difference
between the rates of growth of the components (see Exercise 10.7-4):

$$r_{(u/v)} = r_u - r_v \tag{10.25}$$


† If we plot the natural log of a function $f(t)$ against $t$ in a two-dimensional diagram, the slope of the
curve, accordingly, will tell us the rate of growth of $f(t)$. This provides the rationale for the so-called
semilog scale charts, which are used for comparing the rates of growth of different variables, or the
rates of growth of the same variable in different countries.

Example 3

If consumption $C$ is growing at the rate $\alpha$, and if population $H$ (for "heads") is growing
at the rate $\beta$, what is the rate of growth of per capita consumption? Since per capita
consumption is equal to $C/H$, its rate of growth should be

$$r_{(C/H)} = r_C - r_H = \alpha - \beta$$

Now consider the instantaneous rate of growth of a sum of two functions of time:

$$z = u + v \qquad \text{where } u = f(t), \quad v = g(t)$$

This time, the natural log will be

$$\ln z = \ln(u + v) \qquad [\neq \ln u + \ln v]$$

Thus

$$\begin{aligned}
r_z = \frac{d}{dt}\ln z &= \frac{d}{dt}\ln(u + v) \\
&= \frac{1}{u + v}\,\frac{d}{dt}(u + v) \qquad [\text{by (10.20)}] \\
&= \frac{1}{u + v}\,[f'(t) + g'(t)]
\end{aligned}$$

But from (10.23) we have $r_u = f'(t)/f(t)$, so that $f'(t) = f(t)\,r_u = u\,r_u$. Similarly, we
have $g'(t) = v\,r_v$. As a result, we can write the rule

$$r_{(u+v)} = \frac{u}{u + v}\,r_u + \frac{v}{u + v}\,r_v \tag{10.26}$$

which shows that the rate of growth of a sum is a weighted average of the rates of growth
of the components.

By the same token, we have (see Exercise 10.7-5)

$$r_{(u-v)} = \frac{u}{u - v}\,r_u - \frac{v}{u - v}\,r_v \tag{10.27}$$

Example 4

The exports of goods of a country, $G = G(t)$, have a growth rate of $a/t$, and its exports of
services, $S = S(t)$, have a growth rate of $b/t$. What is the growth rate of its total exports?
Since total exports is $X(t) = G(t) + S(t)$, a sum, its rate of growth should be

$$r_X = \frac{G}{X}\,r_G + \frac{S}{X}\,r_S = \frac{G}{X}\left(\frac{a}{t}\right) + \frac{S}{X}\left(\frac{b}{t}\right) = \frac{Ga + Sb}{Xt}$$
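The rules (10.24) through (10.26) can be illustrated numerically. The sketch below assumes two hypothetical exponential paths for u and v (3 percent and 8 percent growth) and a hypothetical helper `growth_rate`; none of these values comes from the text. It compares the growth rates of the product, quotient, and sum with what the rules predict.

```python
import math

def growth_rate(f, t, h=1e-6):
    """Approximate d/dt ln f(t) by a central difference (hypothetical helper)."""
    return (math.log(f(t + h)) - math.log(f(t - h))) / (2 * h)

# Assumed illustrative paths: u grows at 3 percent, v at 8 percent.
def u(t):
    return 50.0 * math.exp(0.03 * t)

def v(t):
    return 20.0 * math.exp(0.08 * t)

t0 = 2.0
r_u = growth_rate(u, t0)
r_v = growth_rate(v, t0)

r_prod = growth_rate(lambda t: u(t) * v(t), t0)   # rule (10.24): r_u + r_v
r_quot = growth_rate(lambda t: u(t) / v(t), t0)   # rule (10.25): r_u - r_v
r_sum = growth_rate(lambda t: u(t) + v(t), t0)    # rule (10.26): weighted average

# Weighted average with weights u/(u+v) and v/(u+v), evaluated at t0.
weighted = (u(t0) * r_u + v(t0) * r_v) / (u(t0) + v(t0))

print(r_prod, r_u + r_v)
print(r_quot, r_u - r_v)
print(r_sum, weighted)
```

Note that the weights in (10.26) change over time as the faster-growing component becomes a larger share of the sum, so the growth rate of the sum is not constant even when r_u and r_v are.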

Finding the Point Elasticity 

We have seen that, given $y = f(t)$, the derivative of $\ln y$ measures the instantaneous rate of
growth of $y$. Now let us see what happens when, given a function $y = f(x)$, we differentiate
$\ln y$ with respect to $\ln x$, rather than to $x$.

To begin with, let us define $u \equiv \ln y$ and $v \equiv \ln x$. Then we can observe a chain of
relationship linking $u$ to $y$, and thence to $x$ and $v$ as follows:

$$u \equiv \ln y \qquad y = f(x) \qquad x \equiv e^{\ln x} \equiv e^v$$

Accordingly, the derivative of $\ln y$ with respect to $\ln x$ is

$$\begin{aligned}
\frac{d(\ln y)}{d(\ln x)} = \frac{du}{dv} &= \frac{du}{dy}\,\frac{dy}{dx}\,\frac{dx}{dv} \\
&= \left(\frac{d}{dy}\ln y\right)\left(\frac{dy}{dx}\right)\left(\frac{d}{dv}e^v\right)
= \frac{1}{y}\,\frac{dy}{dx}\,e^v = \frac{1}{y}\,\frac{dy}{dx}\,x = \frac{dy}{dx}\,\frac{x}{y}
\end{aligned}$$

But this expression is precisely that of the point elasticity of the function. Hence we have
established the general principle that, for a function $y = f(x)$, the point elasticity of $y$ with
respect to $x$ is

$$\varepsilon_{yx} = \frac{d(\ln y)}{d(\ln x)} \tag{10.28}$$


It should be noted that the subscript $yx$ in this symbol is an indicator that $y$ and $x$ are the
two variables involved and does not imply the multiplication of $y$ and $x$. This is unlike the
case of $r_{(uv)}$, where the subscript does denote a product. Again, we now have an alternative
way of finding the point elasticity of a function by use of logarithms, which may often
prove to be an easier approach, if the given function comes in the form of a multiplicative
or divisional expression.


Example 5

Find the point elasticity of demand, given that $Q = k/P$, where $k$ is a positive constant. This
is the equation of a rectangular hyperbola (see Fig. 2.8d); and, as is well known, a demand
function of this form has a unitary point elasticity at all points. To show this, we shall apply
(10.28). Since the natural log of the demand function is

$$\ln Q = \ln k - \ln P$$

the elasticity of demand ($Q$ with respect to $P$) is indeed

$$\varepsilon_d = \frac{d(\ln Q)}{d(\ln P)} = -1 \qquad \text{or} \qquad |\varepsilon_d| = 1$$
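As a rough numerical illustration of (10.28), the sketch below differentiates ln Q with respect to ln P by a central difference taken in ln P. The helper name `point_elasticity`, the constant k = 12, and the sample prices are assumptions made only for this illustration.

```python
import math

def point_elasticity(f, x, h=1e-6):
    """Approximate d(ln y)/d(ln x) at x, stepping in ln x (hypothetical helper)."""
    lx = math.log(x)
    y_up = f(math.exp(lx + h))
    y_dn = f(math.exp(lx - h))
    return (math.log(y_up) - math.log(y_dn)) / (2 * h)

k = 12.0  # assumed positive constant

def demand(P):
    return k / P   # Q = k/P, the rectangular hyperbola of Example 5

# The elasticity should be close to -1 at every price tried.
elasticities = [point_elasticity(demand, P) for P in (0.5, 2.0, 10.0)]
print(elasticities)
```

The same helper can be pointed at any positive function of a positive variable; for a demand function that is not of the form k/P, the values would generally differ from price to price.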


The result in (10.28) was derived by use of the chain rule of derivatives. It is of interest
that a similar chain rule holds for elasticities; i.e., given a function $y = g(w)$, where
$w = h(x)$, we have

$$\varepsilon_{yx} = \varepsilon_{yw}\,\varepsilon_{wx} \tag{10.29}$$

The proof is as follows:

$$\varepsilon_{yw}\,\varepsilon_{wx} = \left(\frac{dy}{dw}\,\frac{w}{y}\right)\left(\frac{dw}{dx}\,\frac{x}{w}\right)
= \frac{dy}{dw}\,\frac{dw}{dx}\,\frac{w}{y}\,\frac{x}{w} = \frac{dy}{dx}\,\frac{x}{y} = \varepsilon_{yx}$$

EXERCISE 10.7

1. Find the instantaneous rate of growth:

(a) $y = 5t^2$   (b) $y = at^c$   (c) $y = ab^t$   (d) $y = 2^t(t^2)$   (e) $y = t/3^t$

2. If population grows according to the function $H = H_0(2)^{bt}$ and consumption by the
function $C = C_0e^{at}$, find the rates of growth of population, of consumption, and of
per capita consumption by using the natural log.

3. If $y$ is related to $x$ by $y = x^k$, how will the rates of growth $r_y$ and $r_x$ be related?

4. Prove that if $y = u/v$, where $u = f(t)$ and $v = g(t)$, then the rate of growth of $y$ will
be $r_y = r_u - r_v$, as shown in (10.25).

5. The real income $y$ is defined as the nominal income $Y$ deflated by the price level $P$.
How is $r_y$ (for real income) related to $r_Y$ (for nominal income)?

6. Prove the rate-of-growth rule (10.27).

7. Given the demand function $Q_d = k/P^n$, where $k$ and $n$ are positive constants, find the
point elasticity of demand $\varepsilon_d$ by using (10.28) (cf. Exercise 8.1-4).

8. (a) Given $y = wz$, where $w = g(x)$ and $z = h(x)$, establish that $\varepsilon_{yx} = \varepsilon_{wx} + \varepsilon_{zx}$.

(b) Given $y = u/v$, where $u = G(x)$ and $v = H(x)$, establish that $\varepsilon_{yx} = \varepsilon_{ux} - \varepsilon_{vx}$.

9. Given $y = f(x)$, show that the derivative $d(\log_b y)/d(\log_b x)$, with log to base $b$ rather
than $e$, also measures the point elasticity $\varepsilon_{yx}$.

10. Show that, if the demand for money $M_d$ is a function of the national income $Y = Y(t)$
and the interest rate $i = i(t)$, the rate of growth of $M_d$ can be expressed as a weighted
sum of $r_Y$ and $r_i$,

$$r_{M_d} = \varepsilon_{M_dY}\,r_Y + \varepsilon_{M_di}\,r_i$$

where the weights are the elasticities of $M_d$ with respect to $Y$ and $i$, respectively.

11. Given the production function $Q = F(K, L)$, find a general expression for the rate of
growth of $Q$ in terms of the rates of growth of $K$ and $L$.



Chapter 11

The Case of More than One Choice Variable


The problem of optimization was discussed in Chap. 9 within the framework of an objective
function with a single choice variable. In Chap. 10 the discussion was extended to exponential
objective functions, but we still dealt with one choice variable only. Now we must develop
a way of finding the extreme values of an objective function that involves two or more choice
variables. Only then will we be able to tackle the type of problem confronting, say, a
multiproduct firm, where the profit-maximizing decision consists of the choice of optimal output
levels for several commodities and the optimal combination of several different inputs.

We shall discuss first the case of an objective function of two choice variables,
$z = f(x, y)$, in order to take advantage of its graphability. Later the analytical results can
be generalized to the nongraphable n-variable case. Regardless of the number of variables, 
however, we shall assume in general that, when written in a general form, our objective 
function possesses continuous partial derivatives to any desired order. This will ensure the 
smoothness and differentiability of the objective function as well as its partial derivatives. 

For functions of several variables, extreme values are again of two kinds: (1) absolute or
global and (2) relative or local. As before, our attention will be focused heavily on relative
extrema, and for this reason we shall often drop the adjective "relative," with the
understanding that, unless otherwise specified, the extrema referred to are relative. However, in
Sec. 11.5, conditions for absolute extrema will be given due consideration.

11.1 The Differential Version of Optimization Conditions

The discussion in Chap. 9 of optimization conditions for problems with a single choice 
variable was couched entirely in terms of derivatives, as against differentials. To prepare 
for the discussion of problems with two or more choice variables, it would be helpful also 
to know how those conditions can equivalently be expressed in terms of differentials. 

First-Order Condition 

Given a function $z = f(x)$, we can, as explained in Sec. 8.1, write the differential

$$dz = f'(x)\,dx \tag{11.1}$$

and use $dz$ as an approximation to the actual change, $\Delta z$, induced by the change of $x$ from
$x_0$ to $x_0 + \Delta x$; the smaller the $\Delta x$, the better the approximation. From (11.1), it is clear that
if $f'(x) > 0$, $dz$ and $dx$ must take the same algebraic sign; this is illustrated by point $A$ in
Fig. 11.1 (cf. Fig. 8.1b). In the opposite case where $f'(x) < 0$, exemplified by point $A'$, $dz$
and $dx$ take opposite algebraic signs. Since points like $A$ and $A'$, where $f'(x) \neq 0$ and
hence $dz \neq 0$, cannot qualify as stationary points, it stands to reason that a necessary
condition for $z$ to attain an extremum (a stationary value) is $dz = 0$. More accurately, the
condition should be stated as "$dz = 0$ for an arbitrary nonzero $dx$," since a zero $dx$ (no
change in $x$) has no relevance in our present context. In Fig. 11.1, a minimum of $z$ occurs at
point $B$, and a maximum of $z$ occurs at point $B'$. In both cases, with the tangent line
horizontal, i.e., with $f'(x) = 0$ there, $dz$ (the vertical side of the triangle formed with the
tangent line as the hypotenuse) indeed reduces to zero. Thus the first-order derivative
condition "$f'(x) = 0$" can be translated into the first-order differential condition "$dz = 0$ for an
arbitrary nonzero $dx$." Bear in mind, however, that while this differential condition is
necessary for an extremum, it is by no means sufficient, because an inflection point such as
$C$ in Fig. 11.1 can also satisfy the condition that $dz = 0$ for an arbitrary nonzero $dx$.

FIGURE 11.1

Second-Order Condition 

The second-order sufficient conditions for extrema of $z$ are, in terms of derivatives,
$f''(x) < 0$ (for a maximum) and $f''(x) > 0$ (for a minimum) at the stationary point. To
translate those conditions into differential equivalents, we need the notion of second-order
differential, defined as the differential of a differential, i.e., $d(dz)$, commonly denoted by
$d^2z$ (read: "d-two z").

Given that $dz = f'(x)\,dx$, we can obtain $d^2z$ merely by further differentiation of $dz$. In
so doing, however, we should bear in mind that $dx$, representing in this context an arbitrary
or given nonzero change in $x$, is to be treated as a constant during differentiation.
Consequently, $dz$ can vary only with $f'(x)$, but since $f'(x)$ is in turn a function of $x$, $dz$ can in the
final analysis vary only with $x$. In view of this, we have

$$\begin{aligned}
d^2z \equiv d(dz) &= d[f'(x)\,dx] \qquad [\text{by (11.1)}] \\
&= [d f'(x)]\,dx \qquad [dx \text{ is constant}] \\
&= [f''(x)\,dx]\,dx = f''(x)\,dx^2
\end{aligned} \tag{11.2}$$





Note that the exponent 2 appears in (11.2) in two fundamentally different ways. In the
symbol $d^2z$, the exponent 2 (read: "two") indicates the second-order differential of $z$; but in the
symbol $dx^2 \equiv (dx)^2$, the exponent 2 (read: "squared") denotes the squaring of the first-order
differential $dx$. The result in (11.2) provides a direct link between $d^2z$ and $f''(x)$.
Inasmuch as we are considering nonzero values of $dx$ only, the $dx^2$ term is always positive;
thus $d^2z$ and $f''(x)$ must take the same algebraic sign. Just as a positive (negative) $f''(x)$ at
a stationary point delineates a valley (peak), so must a positive (negative) $d^2z$ at such a
point.

It follows that the derivative condition "$f''(x) < 0$ is sufficient for a maximum of $z$" can
equivalently be stated as the differential condition "$d^2z < 0$ for an arbitrary nonzero $dx$ is
sufficient for a maximum of $z$." The translation of the condition for a minimum of $z$ is analogous; we
just need to reverse the sense of inequality in the preceding sentence. Going one step further,
we may also conclude on the basis of (11.2) that the second-order necessary conditions

$$\text{For maximum of } z\text{:} \quad f''(x) \le 0$$
$$\text{For minimum of } z\text{:} \quad f''(x) \ge 0$$

can be translated, respectively, into

$$\text{For maximum of } z\text{:} \quad d^2z \le 0$$
$$\text{For minimum of } z\text{:} \quad d^2z \ge 0$$

for arbitrary nonzero values of $dx$.


Differential Conditions versus Derivative Conditions 

Now that we have demonstrated the possibility of expressing the derivative version of first-
and second-order conditions in terms of $dz$ and $d^2z$, you may very well ask why we bothered
to develop a new set of differential conditions when derivative conditions were already
available. The answer is that differential conditions, but not derivative conditions, are
stated in forms that can be directly generalized from the one-variable case to cases with two
or more choice variables. To be more specific, the first-order condition (zero value for $dz$)
and the second-order condition (negativity or positivity for $d^2z$) are applicable with equal
validity to all cases, provided the phrase "for arbitrary nonzero values of $dx$" is duly modified
to reflect the change in the number of choice variables.

This does not mean, however, that derivative conditions will have no further role to play.
To the contrary, since derivative conditions are operationally more convenient to apply, we
shall, after the generalization process is carried out by means of the differential conditions
to cases with more choice variables, still attempt to develop and make use of derivative
conditions appropriate to those cases.


11.2 Extreme Values of a Function of Two Variables 


For a function of one choice variable, an extreme value is represented graphically by the
peak of a hill or the bottom of a valley in a two-dimensional graph. With two choice
variables, the graph of the function $z = f(x, y)$ becomes a surface in a 3-space, and while
the extreme values are still to be associated with peaks and bottoms, these "hills" and
"valleys" themselves now take on a three-dimensional character. They will, in this new
context, be shaped like domes and bowls, respectively. The two diagrams in Fig. 11.2 serve
to illustrate. Point $A$ in diagram a, the peak of a dome, constitutes a maximum: the value of
$z$ at this point is larger than at any other point in its immediate neighborhood. Similarly,
point $B$ in diagram b, the bottom of a bowl, represents a minimum; everywhere in its
immediate neighborhood the value of the function exceeds that at point $B$.

FIGURE 11.2

First-Order Condition 

For the function

$$z = f(x, y)$$

the first-order necessary condition for an extremum (either maximum or minimum) again
involves $dz = 0$. But since there are two independent variables here, $dz$ is now a total
differential; thus the first-order condition should be modified to the form

$$dz = 0 \quad \text{for arbitrary values of } dx \text{ and } dy\text{, not both zero} \tag{11.3}$$

The rationale behind (11.3) is similar to the explanation of the condition $dz = 0$ for the
one-variable case: an extremum point must be a stationary point, and at a stationary point,
$dz$ as an approximation to the actual change $\Delta z$ must be zero for arbitrary $dx$ and $dy$, not
both zero.

In the present two-variable case, the total differential is

$$dz = f_x\,dx + f_y\,dy \tag{11.4}$$

In order to satisfy condition (11.3), it is necessary and sufficient that the two partial
derivatives $f_x$ and $f_y$ be simultaneously equal to zero. Thus the equivalent derivative version of
the first-order condition (11.3) is

$$f_x = f_y = 0 \qquad \text{or} \qquad \frac{\partial z}{\partial x} = \frac{\partial z}{\partial y} = 0 \tag{11.5}$$


There is a simple graphical interpretation of this condition. With reference to point $A$ in
Fig. 11.2a, to have $f_x = 0$ at that point means that the tangent line $T_x$, drawn through $A$ and
parallel to the $xz$ plane (holding $y$ constant), must have a zero slope. By the same token, to
have $f_y = 0$ at point $A$ means that the tangent line $T_y$, drawn through $A$ and parallel to the
$yz$ plane (holding $x$ constant), must also have a zero slope. You can readily verify that these
tangent-line requirements actually also apply to the minimum point $B$ in Fig. 11.2b. This is
because condition (11.5), like condition (11.3), is a necessary condition for both a
maximum and a minimum.

FIGURE 11.3

As in the earlier discussion, the first-order condition is necessary, but not sufficient. That
it is not sufficient to establish an extremum can be seen from the two diagrams in Fig. 11.3.
At point $C$ in diagram a, both $T_x$ and $T_y$ have zero slopes, but this point does not qualify as
an extremum: Whereas it is a minimum when viewed against the background of the $yz$
plane, it turns out to be a maximum when looked at against the $xz$ plane! A point with such
a "dual personality" is referred to, for graphical reasons, as a saddle point. Similarly, point
$D$ in Fig. 11.3b, while characterized by flat $T_x$ and $T_y$, is no extremum, either; its location
on the twisted surface makes it an inflection point, whether viewed against the $xz$ or the $yz$
plane. These counterexamples decidedly rule out the first-order condition as a sufficient
condition for an extremum.

To develop a sufficient condition, we must look to the second-order total differential, 
which is related to second-order partial derivatives. 


Second-Order Partial Derivatives 

The function $z = f(x, y)$ can give rise to two first-order partial derivatives,

$$f_x \equiv \frac{\partial z}{\partial x} \qquad \text{and} \qquad f_y \equiv \frac{\partial z}{\partial y}$$

Since $f_x$ is itself a function of $x$ (as well as of $y$), we can measure the rate of change of $f_x$
with respect to $x$, while $y$ remains fixed, by a particular second-order (or second) partial
derivative denoted by either $f_{xx}$ or $\partial^2 z/\partial x^2$:

$$f_{xx} \equiv \frac{\partial}{\partial x}(f_x) \qquad \text{or} \qquad \frac{\partial^2 z}{\partial x^2} \equiv \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right)$$

The notation $f_{xx}$ has a double subscript signifying that the primitive function $f$ has been
differentiated partially with respect to $x$ twice, whereas the notation $\partial^2 z/\partial x^2$ resembles that
of $d^2z/dx^2$ except for the use of the partial symbol. In a perfectly analogous manner, we
can use the second partial derivative

$$f_{yy} \equiv \frac{\partial}{\partial y}(f_y) \qquad \text{or} \qquad \frac{\partial^2 z}{\partial y^2} \equiv \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right)$$

to denote the rate of change of $f_y$ with respect to $y$, while $x$ is held constant.

Recall, however, that $f_x$ is also a function of $y$ and that $f_y$ is also a function of $x$. Hence,
there can be written two more second partial derivatives:

$$f_{xy} \equiv \frac{\partial^2 z}{\partial x\,\partial y} \equiv \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right)
\qquad \text{and} \qquad
f_{yx} \equiv \frac{\partial^2 z}{\partial y\,\partial x} \equiv \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right)$$

These are called cross (or mixed) partial derivatives because each measures the rate of
change of one first-order partial derivative with respect to the "other" variable.

It bears repeating that the second-order partial derivatives of $z = f(x, y)$, like $z$ and the
first derivatives $f_x$ and $f_y$, are also functions of the variables $x$ and $y$. When that fact
requires emphasis, we can write $f_{xx}$ as $f_{xx}(x, y)$ and $f_{xy}$ as $f_{xy}(x, y)$, etc. And, along the
same line, we can use the notation $f_{yx}(1, 2)$ to denote the value of $f_{yx}$ evaluated at $x = 1$
and $y = 2$, etc.

Even though $f_{xy}$ and $f_{yx}$ have been separately defined, they will, according to a
proposition known as Young's theorem, have identical values, as long as the two cross partial
derivatives are both continuous. In that case, the sequential order in which partial
differentiation is undertaken becomes immaterial, because $f_{xy} = f_{yx}$. For the ordinary types of
specific functions with which we work, this continuity condition is usually met; for general
functions, as mentioned earlier, we always assume the continuity condition to hold. Hence,
we may in general expect to find identical cross partial derivatives. In fact, the theorem
applies also to functions of three or more variables. Given $z = g(u, v, w)$, for instance, the
mixed partial derivatives will be characterized by $g_{uv} = g_{vu}$, $g_{uw} = g_{wu}$, etc., provided
these partial derivatives are all continuous.


Example 1

Find the four second-order partial derivatives of

$$z = x^3 + 5xy - y^2$$

The first partial derivatives of this function are

$$f_x = 3x^2 + 5y \qquad \text{and} \qquad f_y = 5x - 2y$$

Therefore, upon further differentiation, we get

$$f_{xx} = 6x \qquad f_{yx} = 5 \qquad f_{xy} = 5 \qquad f_{yy} = -2$$

As expected, $f_{yx}$ and $f_{xy}$ are identical.


Example 2

Find all the second partial derivatives of $z = x^2e^{-y}$. In this case, the first partial derivatives are

$$f_x = 2xe^{-y} \qquad \text{and} \qquad f_y = -x^2e^{-y}$$

Thus we have

$$f_{xx} = 2e^{-y} \qquad f_{yx} = -2xe^{-y} \qquad f_{xy} = -2xe^{-y} \qquad f_{yy} = x^2e^{-y}$$

Again, we see that $f_{yx} = f_{xy}$.

Note that the second partial derivatives are all functions of the original variables $x$ and $y$.
This fact is clear enough in Example 2, but it is true even for Example 1, although some
second partial derivatives happen to be constant functions in that case.
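Young's theorem can be spot-checked numerically for the function of Example 2. The sketch below is illustrative only: it takes the first partial derivatives of z = x²e^(−y) as given, differentiates each one with respect to the "other" variable by a central difference, and compares the two results. The helper names `d_dx` and `d_dy`, the step size, and the evaluation point (1.5, 0.7) are assumptions.

```python
import math

# First partial derivatives of z = x^2 e^(-y), as derived in Example 2.
def f_x(x, y):
    return 2 * x * math.exp(-y)

def f_y(x, y):
    return -x**2 * math.exp(-y)

def d_dx(g, x, y, h=1e-5):
    """Central-difference partial derivative of g with respect to x (hypothetical helper)."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    """Central-difference partial derivative of g with respect to y (hypothetical helper)."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.5, 0.7                 # assumed evaluation point
f_xy = d_dx(f_y, x0, y0)          # differentiate f_y with respect to x
f_yx = d_dy(f_x, x0, y0)          # differentiate f_x with respect to y
exact = -2 * x0 * math.exp(-y0)   # analytic cross partial, -2x e^(-y)

print(f_xy, f_yx, exact)
```

A check at one point is of course no proof; it only illustrates that the order of differentiation does not affect the result for this smooth function.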

Second-Order Total Differential 

Given the total differential $dz$ in (11.4), and with the concept of second-order partial
derivatives at our command, we can derive an expression for the second-order total differential
$d^2z$ by further differentiation of $dz$. In so doing, we should remember that in the equation
$dz = f_x\,dx + f_y\,dy$, the symbols $dx$ and $dy$ represent arbitrary or given changes in $x$ and $y$;
so they must be treated as constants during differentiation. As a result, $dz$ depends only on
$f_x$ and $f_y$, and since $f_x$ and $f_y$ are themselves functions of $x$ and $y$, $dz$, like $z$ itself, is a
function of $x$ and $y$.

To obtain $d^2z$, we merely apply the definition of a differential, as shown in (11.4), to
$dz$ itself. Thus,

$$\begin{aligned}
d^2z \equiv d(dz) &= \frac{\partial(dz)}{\partial x}\,dx + \frac{\partial(dz)}{\partial y}\,dy \qquad [\text{cf. (11.4)}] \\
&= \frac{\partial}{\partial x}(f_x\,dx + f_y\,dy)\,dx + \frac{\partial}{\partial y}(f_x\,dx + f_y\,dy)\,dy \\
&= (f_{xx}\,dx + f_{xy}\,dy)\,dx + (f_{yx}\,dx + f_{yy}\,dy)\,dy \\
&= f_{xx}\,dx^2 + f_{xy}\,dy\,dx + f_{yx}\,dx\,dy + f_{yy}\,dy^2 \\
&= f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2 \qquad [f_{xy} = f_{yx}]
\end{aligned} \tag{11.6}$$

Note, again, that the exponent 2 appears in (11.6) in two different ways. In the symbol $d^2z$,
the exponent 2 indicates the second-order total differential of $z$; but in the symbol
$dx^2 \equiv (dx)^2$, the exponent denotes the squaring of the first-order differential $dx$.

The result in (11.6) shows the magnitude of $d^2z$ in terms of given values of $dx$ and $dy$,
measured from some point $(x_0, y_0)$ in the domain. In order to calculate $d^2z$, however,
we also need to know the second-order partial derivatives $f_{xx}$, $f_{xy}$, and $f_{yy}$, all evaluated
at $(x_0, y_0)$, just as we need to know the first-order partial derivatives to calculate $dz$
from (11.4).


Example 3

Given $z = x^3 + 5xy - y^2$, find $dz$ and $d^2z$. This function is the same as the one in Example 1.
Thus, substituting the various derivatives already obtained there into (11.4) and (11.6),
we find*

$$dz = (3x^2 + 5y)\,dx + (5x - 2y)\,dy$$

and

$$d^2z = 6x\,dx^2 + 10\,dx\,dy - 2\,dy^2$$

We can also calculate $dz$ and $d^2z$ at specific points in the domain. At the point $x = 1$ and
$y = 2$, for instance, we have

$$dz = 13\,dx + dy \qquad \text{and} \qquad d^2z = 6\,dx^2 + 10\,dx\,dy - 2\,dy^2$$

* An alternative way of reaching these results is by direct differentiation of the function:

$$dz = d(x^3) + d(5xy) - d(y^2) = 3x^2\,dx + 5y\,dx + 5x\,dy - 2y\,dy$$

Further differentiation of $dz$ (bearing in mind that $dx$ and $dy$ are constants) will then yield

$$\begin{aligned}
d^2z &= d(3x^2)\,dx + d(5y)\,dx + d(5x)\,dy - d(2y)\,dy \\
&= (6x\,dx)\,dx + (5\,dy)\,dx + (5\,dx)\,dy - (2\,dy)\,dy \\
&= 6x\,dx^2 + 10\,dx\,dy - 2\,dy^2
\end{aligned}$$
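The two differentials of Example 3 can be evaluated for particular changes dx and dy. The following sketch hard-codes the derivatives found in Example 1; the function names `dz` and `d2z` and the sample directions are assumptions for illustration. It shows that, at the point (1, 2), the value of d²z depends on the direction of change.

```python
def dz(x, y, dx, dy):
    """First-order total differential of z = x^3 + 5xy - y^2, from (11.4)."""
    f_x = 3 * x**2 + 5 * y
    f_y = 5 * x - 2 * y
    return f_x * dx + f_y * dy

def d2z(x, y, dx, dy):
    """Second-order total differential of the same function, from (11.6)."""
    f_xx, f_xy, f_yy = 6 * x, 5, -2
    return f_xx * dx**2 + 2 * f_xy * dx * dy + f_yy * dy**2

# Coefficients of dz at (1, 2): set one of dx, dy to 1 and the other to 0.
coef_dx = dz(1, 2, 1, 0)
coef_dy = dz(1, 2, 0, 1)

# d2z at (1, 2) along three assumed directions.
east_west = d2z(1, 2, 1, 0)
north_south = d2z(1, 2, 0, 1)
diagonal = d2z(1, 2, 1, 1)

print(coef_dx, coef_dy)
print(east_west, north_south, diagonal)
```

Since d²z is positive in one direction and negative in another at this point, its sign is not definite there.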


Second-Order Condition 

In the one-variable case, $d^2z < 0$ at a stationary point identifies the point as the peak of a hill
in a 2-space. Similarly, in the two-variable case, $d^2z < 0$ at a stationary point would identify
the point as the peak of a dome in a 3-space. Thus, once the first-order necessary condition is
satisfied, the second-order sufficient condition for a maximum of $z = f(x, y)$ is

$$d^2z < 0 \quad \text{for arbitrary values of } dx \text{ and } dy\text{, not both zero} \tag{11.7}$$

A positive $d^2z$ value at a stationary point, on the other hand, is associated with the bottom
of a bowl. The second-order sufficient condition for a minimum of $z = f(x, y)$ is

$$d^2z > 0 \quad \text{for arbitrary values of } dx \text{ and } dy\text{, not both zero} \tag{11.8}$$

The reason why (11.7) and (11.8) are only sufficient, but not necessary, conditions is
that it is again possible for $d^2z$ to take a zero value at a maximum or a minimum. For
this reason, second-order necessary conditions must be stated with weak inequalities as
follows:

$$\begin{aligned}
\text{For maximum of } z\text{:} \quad & d^2z \le 0 \\
\text{For minimum of } z\text{:} \quad & d^2z \ge 0
\end{aligned}
\qquad \text{for arbitrary values of } dx \text{ and } dy\text{, not both zero} \tag{11.9}$$

In the following, however, we shall pay more attention to the second-order sufficient
conditions.

For operational convenience, second-order differential conditions can be translated into
equivalent conditions on second-order derivatives. In the two-variable case, (11.6) shows
that this would entail restrictions on the signs of the second-order partial derivatives
$f_{xx}$, $f_{xy}$, and $f_{yy}$. The actual translation would require a knowledge of quadratic forms,
which will be discussed in Sec. 11.3. But we may first introduce the main result here: For
any values of $dx$ and $dy$, not both zero,

$$\begin{aligned}
d^2z < 0 \quad &\text{iff} \quad f_{xx} < 0;\ f_{yy} < 0;\ \text{and } f_{xx}f_{yy} > f_{xy}^2 \\
d^2z > 0 \quad &\text{iff} \quad f_{xx} > 0;\ f_{yy} > 0;\ \text{and } f_{xx}f_{yy} > f_{xy}^2
\end{aligned}$$

Note that the sign of $d^2z$ hinges not only on $f_{xx}$ and $f_{yy}$, which have to do with the surface
configuration around point $A$ (Fig. 11.4) in the two basic directions shown by $T_x$ (east-west)
and $T_y$ (north-south), but also on the cross partial derivative $f_{xy}$. The role played by this
latter partial derivative is to ensure that the surface in question will yield (two-dimensional)
cross sections with the same type of configuration (hill or valley, as the case may be) not
only in the two basic directions (east-west and north-south), but in all other possible
directions (such as northeast-southwest) as well.

FIGURE 11.4

This result, together with the first-order condition (11.5), enables us to construct
Table 11.1. It should be understood that all the second partial derivatives therein are to
be evaluated at the stationary point where $f_x = f_y = 0$. It should also be stressed that
the second-order sufficient condition is not necessary for an extremum. In particular, if a
stationary value is characterized by $f_{xx}f_{yy} = f_{xy}^2$ in violation of that condition, that
stationary value may nevertheless turn out to be an extremum. On the other hand, in the case
of another type of violation, with a stationary point characterized by $f_{xx}f_{yy} < f_{xy}^2$, we can
identify that point as a saddle point, because the sign of $d^2z$ will in that case be indefinite
(positive for some values of $dx$ and $dy$, but negative for others).


TABLE 11.1  Conditions for Relative Extremum: $z = f(x, y)$

Condition                             Maximum                           Minimum
First-order necessary condition       $f_x = f_y = 0$                   $f_x = f_y = 0$
Second-order sufficient condition*    $f_{xx}, f_{yy} < 0$              $f_{xx}, f_{yy} > 0$
                                      and $f_{xx}f_{yy} > f_{xy}^2$     and $f_{xx}f_{yy} > f_{xy}^2$

* Applicable only after the first-order necessary condition has been satisfied.

Example 4

Find the extreme value(s) of $z = 8x^3 + 2xy - 3x^2 + y^2 + 1$. First let us find all the first and
second partial derivatives:

$$f_x = 24x^2 + 2y - 6x \qquad f_y = 2x + 2y$$
$$f_{xx} = 48x - 6 \qquad f_{yy} = 2 \qquad f_{xy} = 2$$

The first-order condition calls for satisfaction of the simultaneous equations $f_x = 0$ and
$f_y = 0$; that is,

$$24x^2 + 2y - 6x = 0$$
$$2y + 2x = 0$$

The second equation implies that $y = -x$, and when this information is substituted into the
first equation, we get $24x^2 - 8x = 0$, which yields the pair of solutions

$$x_1^* = 0 \qquad [\text{implying } y_1^* = -x_1^* = 0]$$
$$x_2^* = \tfrac{1}{3} \qquad [\text{implying } y_2^* = -\tfrac{1}{3}]$$

To apply the second-order condition, we note that, when $x_1^* = y_1^* = 0$, $f_{xx}$
turns out to be $-6$, while $f_{yy}$ is 2, so that $f_{xx}f_{yy}$ is negative and is necessarily less than
a squared value $f_{xy}^2$. This fails the second-order condition. The fact that $f_{xx}$ and $f_{yy}$ have
opposite signs suggests, of course, that the surface in question will curl upward in one
direction but downward in another, thereby giving rise to a saddle point.

What about the other solution? When evaluated at $x_2^* = \tfrac{1}{3}$, we find that $f_{xx} = 10$, which,
together with the fact that $f_{yy} = f_{xy} = 2$, meets all three parts of the second-order sufficient
condition for a minimum. Therefore, by setting $x = \tfrac{1}{3}$ and $y = -\tfrac{1}{3}$ in the given function, we
can obtain as a minimum of $z$ the value $z^* = \tfrac{23}{27}$. In the present example, there thus exists only
one relative extremum (a minimum), which can be represented by the ordered triple

$$(x^*, y^*, z^*) = \left(\tfrac{1}{3},\ -\tfrac{1}{3},\ \tfrac{23}{27}\right)$$

Example 5

Find the extreme value(s) of $z = x + 2ey - e^x - e^{2y}$. The relevant derivatives of this function
are

$$f_x = 1 - e^x \qquad f_y = 2e - 2e^{2y}$$
$$f_{xx} = -e^x \qquad f_{yy} = -4e^{2y} \qquad f_{xy} = 0$$

To satisfy the necessary condition, we must have

$$1 - e^x = 0$$
$$2e - 2e^{2y} = 0$$

which has only one solution, namely, $x^* = 0$ and $y^* = \tfrac{1}{2}$. To ascertain the status of the value
of $z$ corresponding to this solution (the stationary value), we evaluate the second-order
derivatives at $x = 0$ and $y = \tfrac{1}{2}$, and find that $f_{xx} = -1$, $f_{yy} = -4e$, and $f_{xy} = 0$. Since $f_{xx}$ and
$f_{yy}$ are both negative and since, in addition, $(-1)(-4e) > 0$, we may conclude that the $z$
value in question, namely,

$$z^* = 0 + e - e^0 - e^1 = -1$$

is a maximum value of the function. This maximum point on the given surface can be
denoted by the ordered triple $(x^*, y^*, z^*) = (0, \tfrac{1}{2}, -1)$.

Again, note that, to evaluate the second partial derivatives at $x^*$ and $y^*$, differentiation
must be undertaken first, and then the specific values of $x^*$ and $y^*$ are to be substituted into
the derivatives as the final step.


EXERCISE 11.2 

Use Table 11.1 to find the extreme value(s) of each of the following four functions, and
determine whether they are maxima or minima:

1. $z = x^2 + xy + 2y^2 + 3$

2. $z = -x^2 - y^2 + 6x + 2y$

3. $z = ax^2 + by^2 + c$; consider each of the three subcases:

(a) $a > 0$, $b > 0$   (b) $a < 0$, $b < 0$   (c) $a$ and $b$ opposite in sign

4. $z = e^{2x} - 2x + 2y^2 + 3$

5. Consider the function $z = (x - 2)^4 + (y - 3)^4$.

(a) Establish by intuitive reasoning that $z$ attains a minimum ($z^* = 0$) at $x^* = 2$ and
$y^* = 3$.

(b) Is the first-order necessary condition in Table 11.1 satisfied?

(c) Is the second-order sufficient condition in Table 11.1 satisfied?

(d) Find the value of $d^2z$. Does it satisfy the second-order necessary condition for a
minimum in (11.9)?


11.3 Quadratic Forms—An Excursion 

The expression for $d^2z$ on the last line of (11.6) exemplifies what are known as quadratic
forms, for which there exist established criteria for determining whether their signs are
always positive, negative, nonpositive, or nonnegative, for arbitrary values of $dx$ and $dy$, not
both zero. Since the second-order condition for extremum hinges directly on the sign of
$d^2z$, those criteria are of direct interest.

To begin with, we define a form as a polynomial expression in which each component
term has a uniform degree. Our earlier encounter with polynomials was confined to the
case of a single variable: $a_0 + a_1x + \cdots + a_nx^n$. When more variables are involved, each
term of a polynomial may contain either one variable or several variables, each raised to a
nonnegative integer power, such as $3x + 4x^2y^3 - 2yz$. In the special case where each term
has a uniform degree, i.e., where the sum of exponents in each term is uniform, the
polynomial is called a form. For example, $4x - 9y + z$ is a linear form in three variables,
because each of its terms is of the first degree. On the other hand, the polynomial
$4x^2 - xy + 3y^2$, in which each term is of the second degree (sum of integer exponents = 2),
constitutes a quadratic form in two variables. We may also encounter quadratic forms in
three variables, such as $x^2 + 2xy - yw - 7w^2$, or indeed in $n$ variables.


Second-Order Total Differential as a Quadratic Form 

If we consider the differentials $dx$ and $dy$ in (11.6) as variables and the partial derivatives as
coefficients, i.e., if we let

\[
u = dx \qquad v = dy \qquad a = f_{xx} \qquad b = f_{yy} \qquad h = f_{xy} \; [= f_{yx}]
\]

then the second-order total differential

\[
d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2
\]

can easily be identified as a quadratic form $q$ in the two variables $u$ and $v$:

\[
q = au^2 + 2huv + bv^2 \tag{11.6'}
\]

Note that, in this quadratic form, $dx = u$ and $dy = v$ are cast in the role of variables,
whereas the second partial derivatives are treated as constants, the exact opposite of the
situation when we were differentiating $dz$ to get $d^2z$. The reason for this role reversal lies in
the changed nature of the problem we are now dealing with. The second-order sufficient
condition for extremum stipulates $d^2z$ to be definitely positive (for a minimum) and
definitely negative (for a maximum), regardless of the values that $dx$ and $dy$ may take (so long
as they are not both zero). It is obvious, therefore, that in the present context $dx$ and $dy$ must
be considered as variables. The second partial derivatives, on the other hand, will assume
specific values at the points we are examining as possible extremum points, and thus may
be regarded as constants.

The major question becomes, then: What restrictions must be placed upon $a$, $b$, and $h$
in (11.6'), while $u$ and $v$ are allowed to take any values, in order to ensure a definite sign
for $q$?


Positive and Negative Definiteness 

As a matter of terminology, let us remark that a quadratic form $q$ is said to be

    positive definite        if q is invariably   positive      (> 0)
    positive semidefinite    if q is invariably   nonnegative   (≥ 0)
    negative semidefinite    if q is invariably   nonpositive   (≤ 0)
    negative definite        if q is invariably   negative      (< 0)

regardless of the values of the variables in the quadratic form, not all zero. If $q$ changes
signs when the variables assume different values, on the other hand, $q$ is said to be indefinite.
Clearly, the cases of positive and negative definiteness of $q = d^2z$ are related to the
second-order sufficient conditions for a minimum and a maximum, respectively. The cases of
semidefiniteness, on the other hand, relate to second-order necessary conditions. When
$q = d^2z$ is indefinite, we have the symptom of a saddle point.


Determinantal Test for Sign Definiteness 

A widely used test for the sign definiteness of $q$ calls for the examination of the signs of
certain determinants. This test happens to be more easily applicable to positive and negative
definiteness (as against semidefiniteness); that is, it applies more easily to second-order
sufficient (as against necessary) conditions. We shall confine our discussion here to the
sufficient conditions only.†

For the two-variable case, determinantal conditions for the sign definiteness of $q$ are
relatively easy to derive. In the first place, we see that the signs of the first and third terms
in (11.6') are independent of the values of the variables $u$ and $v$, because these variables
appear in squares. Thus it is easy to specify the condition for the positive or negative
definiteness of these terms alone, by restricting the signs of $a$ and $b$. The trouble spot lies in
the middle term. But if we can convert the entire polynomial into an expression such that
the variables $u$ and $v$ appear only in some squares, the definiteness of the sign of $q$ will
again become tractable.

† For a discussion of a determinantal test for second-order necessary conditions, see Alpha C. Chiang,
Elements of Dynamic Optimization, Waveland Press Inc., 1992, pp. 85-90.

The device that will do the trick is that of completing the square. By adding $h^2v^2/a$
to, and subtracting the same quantity from, the right side of (11.6'), we can rewrite the
quadratic form as follows:

\[
\begin{aligned}
q &= au^2 + 2huv + \frac{h^2}{a}v^2 + bv^2 - \frac{h^2}{a}v^2 \\
  &= a\left(u^2 + \frac{2h}{a}uv + \frac{h^2}{a^2}v^2\right) + \left(b - \frac{h^2}{a}\right)v^2 \\
  &= a\left(u + \frac{h}{a}v\right)^2 + \frac{ab - h^2}{a}\,(v^2)
\end{aligned}
\]

Now that the variables $u$ and $v$ appear only in squares, we can predicate the sign of
$q$ entirely on the values of the coefficients $a$, $b$, and $h$ as follows:

\[
q \text{ is }
\begin{cases} \text{positive definite} \\ \text{negative definite} \end{cases}
\text{ iff }
\begin{cases} a > 0 \\ a < 0 \end{cases}
\text{ and } ab - h^2 > 0 \tag{11.11}
\]

Note that (1) $ab - h^2$ should be positive in both cases and (2) as a prerequisite for the
positivity of $ab - h^2$, the product $ab$ must be positive (since it must exceed the squared term
$h^2$); hence, this condition automatically implies that $a$ and $b$ must take the identical
algebraic sign.
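The completing-the-square identity can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function names `q_direct` and `q_completed` are illustrative assumptions, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

def q_direct(a, b, h, u, v):
    """Evaluate q = a*u^2 + 2*h*u*v + b*v^2 as written in (11.6')."""
    return a * u * u + 2 * h * u * v + b * v * v

def q_completed(a, b, h, u, v):
    """Evaluate q after completing the square; requires a != 0."""
    a, b, h = Fraction(a), Fraction(b), Fraction(h)
    return a * (u + h / a * v) ** 2 + (a * b - h * h) / a * v * v

# Illustrative (assumed) coefficients and trial values for u and v.
for (a, b, h) in [(5, 2, Fraction(3, 2)), (-2, -1, 1), (1, 4, 3)]:
    for (u, v) in [(1, 0), (0, 1), (3, -7), (-2, 5)]:
        assert q_direct(a, b, h, u, v) == q_completed(a, b, h, u, v)
```

Because both expressions agree for every trial pair, the sign of q can indeed be read off from the two coefficients a and (ab - h^2)/a, as (11.11) states.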

The condition just derived may be stated more succinctly by the use of determinants. We
observe first that the quadratic form in (11.6') can be rearranged into the following square,
symmetric format:

\[
\begin{aligned}
q = {} & a(u^2) + h(uv) \\
       & + h(vu) + b(v^2)
\end{aligned}
\]

with the squared terms placed on the diagonal and with the $2huv$ term split into two equal
parts and placed off the diagonal. The coefficients now form a symmetric matrix, with $a$
and $b$ on the principal diagonal and $h$ off the diagonal. Viewed in this light, the quadratic
form is also easily seen to be the 1 x 1 matrix (a scalar) resulting from the following matrix
multiplication:

\[
q = \begin{bmatrix} u & v \end{bmatrix}
\begin{bmatrix} a & h \\ h & b \end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}
\]

Note that this is a more generalized case of the matrix product $x'Ax$ discussed in Sec. 4.4,
Example 5. In that example, with a so-called diagonal matrix (a symmetric matrix with
only zeros as its off-diagonal elements) as $A$, the product $x'Ax$ represents a weighted sum
of squares. Here, with any symmetric matrix as $A$ (allowing nonzero off-diagonal elements
to appear), the product $x'Ax$ is a quadratic form.
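To see the matrix product and the polynomial give the same number, one can evaluate both for a trial point. This is only a sketch under assumed values: the helper name `quadratic_form` and the numbers chosen for a, h, b, u, v are illustrative and do not come from the text.

```python
def quadratic_form(d, x):
    """Evaluate the scalar x' D x for a square matrix d (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * d[i][j] * x[j] for i in range(n) for j in range(n))

# Illustrative (assumed) coefficients and variable values.
a, h, b = 5, 1, 2
u, v = 3, -4

as_matrix_product = quadratic_form([[a, h], [h, b]], [u, v])
as_polynomial = a * u**2 + 2 * h * u * v + b * v**2
print(as_matrix_product == as_polynomial)
```

With a diagonal coefficient matrix (h = 0) the same function returns a weighted sum of squares, which is the special case recalled from Sec. 4.4.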


The determinant of the 2 x 2 coefficient matrix,
$\begin{vmatrix} a & h \\ h & b \end{vmatrix}$, which is referred to as the
discriminant of the quadratic form $q$, and which we shall therefore denote by $|D|$, supplies
the clue to the criterion in (11.11), for the latter can be alternatively expressed as:

\[
q \text{ is }
\begin{cases} \text{positive definite} \\ \text{negative definite} \end{cases}
\text{ iff }
\begin{cases} |a| > 0 \\ |a| < 0 \end{cases}
\text{ and }
\begin{vmatrix} a & h \\ h & b \end{vmatrix} > 0 \tag{11.11'}
\]

The determinant $|a| = a$ is simply the first leading principal minor of $|D|$. The
determinant $\begin{vmatrix} a & h \\ h & b \end{vmatrix}$ is, on the other hand, the second
leading principal minor of $|D|$. In the present case, there are only two leading principal
minors available, and their signs will serve to determine the positive or negative
definiteness of $q$.

When (11.11') is translated, via (11.10), into terms of the second-order total differential
$d^2z$, we have


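\[
d^2z \text{ is }
\begin{cases} \text{positive definite} \\ \text{negative definite} \end{cases}
\text{ iff }
\begin{cases} f_{xx} > 0 \\ f_{xx} < 0 \end{cases}
\text{ and }
\begin{vmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{vmatrix}
= f_{xx}f_{yy} - f_{xy}^2 > 0
\]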


Recalling that the latter inequality implies that $f_{xx}$ and $f_{yy}$ are required to take the same
sign, we see that this is precisely the second-order sufficient condition presented in
Table 11.1.

In general, the discriminant of a quadratic form

\[
q = au^2 + 2huv + bv^2
\]

is the symmetric determinant $\begin{vmatrix} a & h \\ h & b \end{vmatrix}$. In the particular
case of the quadratic form

\[
d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2
\]

the discriminant is a determinant with the second-order partial derivatives as its elements.
Such a determinant is called a Hessian determinant (or simply a Hessian). In the
two-variable case, the Hessian is

\[
|H| = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix}
\]

which, in view of Young's theorem ($f_{xy} = f_{yx}$), is symmetric, as a discriminant should
be. You should carefully distinguish the Hessian determinant from the Jacobian
determinant discussed in Sec. 7.6.

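Example 1

Is $q = 5u^2 + 3uv + 2v^2$ either positive or negative definite? The discriminant of $q$ is
$\begin{vmatrix} 5 & 1.5 \\ 1.5 & 2 \end{vmatrix}$, with leading principal minors

\[
5 > 0 \qquad \text{and} \qquad \begin{vmatrix} 5 & 1.5 \\ 1.5 & 2 \end{vmatrix} = 7.75 > 0
\]

Therefore $q$ is positive definite.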


Example 2

Given $f_{xx} = -2$, $f_{xy} = 1$, and $f_{yy} = -1$ at a certain point on a function $z = f(x, y)$, does
$d^2z$ have a definite sign at that point regardless of the values of $dx$ and $dy$? The discriminant
of the quadratic form $d^2z$ is in this case
$\begin{vmatrix} -2 & 1 \\ 1 & -1 \end{vmatrix}$, with leading principal minors

\[
-2 < 0 \qquad \text{and} \qquad \begin{vmatrix} -2 & 1 \\ 1 & -1 \end{vmatrix} = 1 > 0
\]

Thus $d^2z$ is negative definite.
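The two-variable determinantal test in (11.11') is mechanical enough to sketch in a few lines. The function names below (`leading_minors_2x2`, `sign_definiteness_2x2`) are illustrative assumptions; the inputs are the discriminants of Examples 1 and 2.

```python
from fractions import Fraction

def leading_minors_2x2(a, h, b):
    """Return (|D1|, |D2|) for the discriminant with rows (a, h) and (h, b)."""
    return a, a * b - h * h

def sign_definiteness_2x2(a, h, b):
    """Apply (11.11'): only definiteness is decided, not semidefiniteness."""
    d1, d2 = leading_minors_2x2(a, h, b)
    if d1 > 0 and d2 > 0:
        return "positive definite"
    if d1 < 0 and d2 > 0:
        return "negative definite"
    return "neither positive nor negative definite"

# Example 1: q = 5u^2 + 3uv + 2v^2, so a = 5, h = 3/2, b = 2.
print(sign_definiteness_2x2(5, Fraction(3, 2), 2))
# Example 2: f_xx = -2, f_xy = 1, f_yy = -1.
print(sign_definiteness_2x2(-2, 1, -1))
```

Note that h is half the coefficient of the cross-product term, which is why Example 1 uses 3/2 rather than 3.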
Three-Variable Quadratic Forms

Can similar conditions be obtained for a quadratic form in three variables?

A quadratic form with three variables $u_1$, $u_2$, and $u_3$ may be generally represented as

\[
\begin{aligned}
q(u_1, u_2, u_3) = {} & d_{11}(u_1^2) + d_{12}(u_1u_2) + d_{13}(u_1u_3) \\
 & + d_{21}(u_2u_1) + d_{22}(u_2^2) + d_{23}(u_2u_3) \\
 & + d_{31}(u_3u_1) + d_{32}(u_3u_2) + d_{33}(u_3^2) \\
 = {} & \sum_{i=1}^{3}\sum_{j=1}^{3} d_{ij}u_iu_j
\end{aligned}
\tag{11.12}
\]

where the double-$\sum$ (double-sum) notation means that both the index $i$ and the index $j$ are
allowed to take the values 1, 2, and 3; and thus the double-sum expression is equivalent to
the 3 x 3 array shown in Eq. (11.12). Such a square array of the quadratic form is,
incidentally, always to be considered a symmetric one, even though we have written the pair of
coefficients $(d_{12}, d_{21})$ or $(d_{23}, d_{32})$ as if the two members of each pair were different. For if
the term in the quadratic form involving the variables $u_1$ and $u_2$ happens to be, say, $12u_1u_2$,
we can let $d_{12} = d_{21} = 6$, so that $d_{12}u_1u_2 = d_{21}u_2u_1$, and a similar procedure may be
applied to make the other off-diagonal elements symmetrical.

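Actually, this three-variable quadratic form is again expressible as a product of three
matrices:

\[
q(u_1, u_2, u_3) =
\begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}
\begin{bmatrix} d_{11} & d_{12} & d_{13} \\ d_{21} & d_{22} & d_{23} \\ d_{31} & d_{32} & d_{33} \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}
= u'Du \tag{11.12'}
\]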


As in the two-variable case, the first matrix (a row vector) and the third matrix (a column
vector) merely list the variables, and the middle one ($D$) is a symmetric coefficient matrix
from the square-array version of the quadratic form in (11.12). This time, however, a total
of three leading principal minors can be formed from its discriminant, namely,

\[
|D_1| = d_{11} \qquad
|D_2| = \begin{vmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{vmatrix} \qquad
|D_3| = \begin{vmatrix} d_{11} & d_{12} & d_{13} \\ d_{21} & d_{22} & d_{23} \\ d_{31} & d_{32} & d_{33} \end{vmatrix}
\]

where $|D_i|$ denotes the $i$th leading principal minor of the discriminant $|D|$.† It turns out that
the conditions for positive or negative definiteness can again be stated in terms of certain
sign restrictions on these principal minors.

† We have so far viewed the $i$th leading principal minor $|D_i|$ as a subdeterminant formed by retaining
the first $i$ principal-diagonal elements of $|D|$. Since the notion of a minor implies the deletion of
something from the original determinant, however, you may prefer to view the $i$th leading principal
minor alternatively as a subdeterminant formed by deleting the last $(n - i)$ rows and columns of $|D|$.

By the now-familiar device of completing the square, the quadratic form in (11.12) can
be converted into an expression in which the three variables appear only as components of
some squares. Specifically, recalling that $d_{12} = d_{21}$, etc., we have

\[
\begin{aligned}
q = {} & d_{11}\left(u_1 + \frac{d_{12}}{d_{11}}u_2 + \frac{d_{13}}{d_{11}}u_3\right)^2
 + \frac{d_{11}d_{22} - d_{12}^2}{d_{11}}
   \left(u_2 + \frac{d_{11}d_{23} - d_{12}d_{13}}{d_{11}d_{22} - d_{12}^2}\,u_3\right)^2 \\
 & + \frac{d_{11}d_{22}d_{33} - d_{11}d_{23}^2 - d_{22}d_{13}^2 - d_{33}d_{12}^2 + 2d_{12}d_{13}d_{23}}
          {d_{11}d_{22} - d_{12}^2}\,(u_3)^2
\end{aligned}
\]
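The three-variable sum-of-squares expression is easy to mistype, so a numerical spot check is worthwhile. The sketch below is an illustration only: the function names and the two symmetric test matrices are assumptions, and exact fractions are used so the two sides can be compared with `==`.

```python
from fractions import Fraction as F

def q_direct(d, u):
    """Evaluate the double sum of d[i][j] * u[i] * u[j] over i, j = 1, 2, 3."""
    return sum(d[i][j] * u[i] * u[j] for i in range(3) for j in range(3))

def q_squares(d, u):
    """Evaluate q from the completed-square form; d must be symmetric with |D1|, |D2| nonzero."""
    d11, d12, d13 = d[0]
    d22, d23 = d[1][1], d[1][2]
    d33 = d[2][2]
    D1 = F(d11)
    D2 = F(d11 * d22 - d12**2)
    D3 = F(d11 * d22 * d33 - d11 * d23**2 - d22 * d13**2
           - d33 * d12**2 + 2 * d12 * d13 * d23)
    s1 = u[0] + F(d12) / d11 * u[1] + F(d13) / d11 * u[2]   # first squared expression
    s2 = u[1] + F(d11 * d23 - d12 * d13) / D2 * u[2]        # second squared expression
    return D1 * s1**2 + D2 / D1 * s2**2 + D3 / D2 * u[2]**2

# Two assumed symmetric coefficient matrices and a few trial vectors.
for d in ([[1, -1, 0], [-1, 6, -2], [0, -2, 3]], [[2, 1, 0], [1, 3, 1], [0, 1, 4]]):
    for u in [(1, 1, 1), (1, -2, 3), (0, 0, 5), (-4, 2, 0)]:
        assert q_direct(d, u) == q_squares(d, u)
```

The three coefficients computed inside `q_squares` are exactly the ratios |D1|, |D2|/|D1| and |D3|/|D2| of successive leading principal minors.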


This sum of squares will be positive (negative) for any values of $u_1$, $u_2$, and $u_3$, not all zero,
if and only if the coefficients of the three squared expressions are all positive (negative).
But the three coefficients (in the order given) can be expressed in terms of the three
leading principal minors as follows:

\[
|D_1| \qquad \frac{|D_2|}{|D_1|} \qquad \frac{|D_3|}{|D_2|}
\]

Hence, for positive definiteness, the necessary-and-sufficient condition is threefold:

    |D_1| > 0
    |D_2| > 0   [given that |D_1| > 0 already]
    |D_3| > 0   [given that |D_2| > 0 already]

In other words, the three leading principal minors must all be positive. For negative
definiteness, on the other hand, the necessary-and-sufficient condition becomes:

    |D_1| < 0
    |D_2| > 0   [given that |D_1| < 0 already]
    |D_3| < 0   [given that |D_2| > 0 already]

That is, the three leading principal minors must alternate in sign in the specified manner.


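Example 3

Determine whether $q = u_1^2 + 6u_2^2 + 3u_3^2 - 2u_1u_2 - 4u_2u_3$ is either positive or negative
definite. The discriminant of $q$ is

\[
\begin{vmatrix} 1 & -1 & 0 \\ -1 & 6 & -2 \\ 0 & -2 & 3 \end{vmatrix}
\]

with leading principal minors as follows:

\[
1 > 0 \qquad
\begin{vmatrix} 1 & -1 \\ -1 & 6 \end{vmatrix} = 5 > 0 \qquad \text{and} \qquad
\begin{vmatrix} 1 & -1 & 0 \\ -1 & 6 & -2 \\ 0 & -2 & 3 \end{vmatrix} = 11 > 0
\]

Therefore, the quadratic form is positive definite.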


Example 4

Determine whether $q = 2u^2 + 3v^2 - w^2 + 6uv - 8uw - 2vw$ is either positive or negative
definite. The discriminant may be written as
$\begin{vmatrix} 2 & 3 & -4 \\ 3 & 3 & -1 \\ -4 & -1 & -1 \end{vmatrix}$, and we find its first leading
principal minor to be $2 > 0$, but the second leading principal minor is
$\begin{vmatrix} 2 & 3 \\ 3 & 3 \end{vmatrix} = -3 < 0$.
This violates the condition for both positive and negative definiteness; thus $q$ is neither
positive nor negative definite.


n-Variable Quadratic Forms 

As an extension of the preceding result to the $n$-variable case, we shall state without proof
that, for the quadratic form

\[
\begin{aligned}
q(u_1, u_2, \ldots, u_n) &= \sum_{i=1}^{n}\sum_{j=1}^{n} d_{ij}u_iu_j \qquad [\text{where } d_{ij} = d_{ji}] \\
 &= \underset{(1\times n)}{u'}\;\underset{(n\times n)}{D}\;\underset{(n\times 1)}{u} \qquad [\text{cf. } (11.12')]
\end{aligned}
\]

the necessary-and-sufficient condition for positive definiteness is that the leading principal
minors of $|D|$, namely,

\[
|D_1| = d_{11} \qquad
|D_2| = \begin{vmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{vmatrix} \qquad \cdots \qquad
|D_n| = \begin{vmatrix}
d_{11} & d_{12} & \cdots & d_{1n} \\
d_{21} & d_{22} & \cdots & d_{2n} \\
\vdots & \vdots &        & \vdots \\
d_{n1} & d_{n2} & \cdots & d_{nn}
\end{vmatrix}
\]

all be positive. The corresponding necessary-and-sufficient condition for negative
definiteness is that the leading principal minors alternate in sign as follows:

\[
|D_1| < 0 \qquad |D_2| > 0 \qquad |D_3| < 0 \qquad \text{(etc.)}
\]

so that all the odd-numbered ones are negative and all even-numbered ones are positive.
The $n$th leading principal minor, $|D_n| = |D|$, should be positive if $n$ is even, but negative if
$n$ is odd. This can be expressed succinctly by the inequality $(-1)^n|D_n| > 0$.
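The n-variable determinantal test can be sketched as a short routine. This is a minimal illustration rather than an efficient implementation: the names `det`, `leading_principal_minors` and `determinantal_test` are assumptions, and the determinant is computed by Laplace expansion, which is adequate only for small matrices.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to obtain the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_principal_minors(d):
    """Return [|D1|, |D2|, ..., |Dn|] by keeping the first k rows and columns."""
    return [det([row[:k] for row in d[:k]]) for k in range(1, len(d) + 1)]

def determinantal_test(d):
    """Classify a symmetric matrix d as positive definite, negative definite, or neither."""
    minors = leading_principal_minors(d)
    if all(m > 0 for m in minors):
        return "positive definite"
    if all((-1) ** k * m > 0 for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "neither positive nor negative definite"

# Discriminants of Examples 3 and 4.
print(determinantal_test([[1, -1, 0], [-1, 6, -2], [0, -2, 3]]))
print(determinantal_test([[2, 3, -4], [3, 3, -1], [-4, -1, -1]]))
```

The second branch encodes the alternating-sign rule as (-1)^k |D_k| > 0 for every k, which reduces to (-1)^n |D_n| > 0 for the last minor.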


Characteristic-Root Test for Sign Definiteness 

Aside from the preceding determinantal test for the sign definiteness of a quadratic form
$u'Du$, there is an alternative test that utilizes the concept of the so-called characteristic
roots of the matrix $D$. This concept arises in a problem of the following nature. Given an
$n \times n$ matrix $D$, can we find a scalar $r$ and an $n \times 1$ vector $x \neq 0$, such that the matrix
equation

\[
\underset{(n\times n)}{D}\;\underset{(n\times 1)}{x} = r\underset{(n\times 1)}{x} \tag{11.13}
\]

is satisfied? If so, the scalar $r$ is referred to as a characteristic root of matrix $D$ and $x$ as a
characteristic vector of that matrix.†

The matrix equation $Dx = rx$ can be rewritten as $Dx - rIx = 0$, or

\[
(D - rI)x = 0 \qquad \text{where } 0 \text{ is } n \times 1 \tag{11.13'}
\]


† Characteristic roots are also known by the alternative names of latent roots, or eigenvalues.
Characteristic vectors are also called eigenvectors.

This, of course, represents a system of $n$ homogeneous linear equations. Since we want a
nontrivial solution for $x$, the coefficient matrix $(D - rI)$, called the characteristic matrix
of $D$, is required to be singular. In other words, its determinant must be made to vanish:

\[
|D - rI| =
\begin{vmatrix}
d_{11} - r & d_{12} & \cdots & d_{1n} \\
d_{21} & d_{22} - r & \cdots & d_{2n} \\
\vdots & \vdots &  & \vdots \\
d_{n1} & d_{n2} & \cdots & d_{nn} - r
\end{vmatrix} = 0 \tag{11.14}
\]

Equation (11.14) is called the characteristic equation of matrix $D$. Since the determinant
$|D - rI|$ will yield, upon Laplace expansion, an $n$th-degree polynomial in the variable $r$,
(11.14) is in fact an $n$th-degree polynomial equation. There will thus be a total of $n$ roots,
$(r_1, \ldots, r_n)$, each of which qualifies as a characteristic root. If $D$ is symmetric, as is the
case in the quadratic-form context, the characteristic roots will always turn out to be real
numbers, but they can take either algebraic sign, or be zero.

Inasmuch as these values of $r$ will all make the determinant $|D - rI|$ vanish, the
substitution of any of these (say, $r_i$) into the equation system (11.13') will produce a
corresponding vector $x|_{r=r_i}$. More accurately, the system being homogeneous, it will yield an infinite
number of vectors corresponding to the root $r_i$. We shall, however, apply a process of
normalization (to be explained in Example 5) and select a particular member of that infinite set
as the characteristic vector corresponding to $r_i$; this vector will be denoted by $v_i$. With a
total of $n$ characteristic roots, there should be a total of $n$ such corresponding characteristic
vectors.


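Example 5

Find the characteristic roots and vectors of the matrix
$\begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}$. By substituting the given
matrix for $D$ in (11.14), we get the equation

\[
\begin{vmatrix} 2 - r & 2 \\ 2 & -1 - r \end{vmatrix} = r^2 - r - 6 = 0
\]

with roots $r_1 = 3$ and $r_2 = -2$. When the first root is used, the matrix equation (11.13')
takes the form of

\[
\begin{bmatrix} 2 - 3 & 2 \\ 2 & -1 - 3 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} -1 & 2 \\ 2 & -4 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

The two rows of the coefficient matrix being linearly dependent, as we would expect in
view of (11.14), there is an infinite number of solutions, which can be expressed by the
equation $x_1 = 2x_2$. To force out a unique solution, we normalize the solution by imposing
the restriction $x_1^2 + x_2^2 = 1$.† Then, since

\[
x_1^2 + x_2^2 = (2x_2)^2 + x_2^2 = 5x_2^2 = 1
\]

we can obtain (by taking the positive square root) $x_2 = 1/\sqrt{5}$, and also $x_1 = 2x_2 = 2/\sqrt{5}$.
Thus the first characteristic vector is

\[
v_1 = \begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}
\]

† More generally, for the $n$-variable case, we require that $\sum_{i=1}^{n} x_i^2 = 1$.

Similarly, by using the second root $r_2 = -2$ in (11.13'), we get the equation

\[
\begin{bmatrix} 2 - (-2) & 2 \\ 2 & -1 - (-2) \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]

which has the solution $x_1 = -\tfrac{1}{2}x_2$. Upon normalization, we find

\[
x_1^2 + x_2^2 = \left(-\tfrac{1}{2}x_2\right)^2 + x_2^2 = \tfrac{5}{4}x_2^2 = 1
\]

which yields $x_2 = 2/\sqrt{5}$ and $x_1 = -1/\sqrt{5}$. Thus the second characteristic vector is

\[
v_2 = \begin{bmatrix} -1/\sqrt{5} \\ 2/\sqrt{5} \end{bmatrix}
\]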

The set of characteristic vectors obtained in this manner possesses two important
properties: First, the scalar product $v_i'v_i$ ($i = 1, 2, \ldots, n$) must be equal to unity, since

\[
v_i'v_i = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \sum_{i=1}^{n} x_i^2 = 1 \qquad [\text{by normalization}]
\]

Second, the scalar product $v_i'v_j$ (where $i \neq j$) can always be taken to be zero.† In sum,
therefore, we may write that

\[
v_i'v_i = 1 \qquad \text{and} \qquad v_i'v_j = 0 \quad (i \neq j) \tag{11.15}
\]

These properties will prove useful later (see Example 6). As a matter of terminology, when
two vectors yield a zero-valued scalar product, the vectors are said to be orthogonal
(perpendicular) to each other.‡ Hence each pair of characteristic vectors of matrix $D$ must
be orthogonal. The other property, $v_i'v_i = 1$, is indicative of normalization. Together, these
two properties account for the fact that the characteristic vectors $(v_1, \ldots, v_n)$ are said to
be a set of orthonormal vectors. You should try to verify the orthonormality of the two
characteristic vectors found in Example 5.

† To demonstrate this, we note that, by (11.13), we may write $Dv_j = r_jv_j$ and $Dv_i = r_iv_i$. By
premultiplying both sides of each of these equations by an appropriate row vector, we have

\[
\begin{aligned}
v_i'Dv_j &= v_i'r_jv_j = r_jv_i'v_j && [r_j \text{ is a scalar}] \\
v_j'Dv_i &= v_j'r_iv_i = r_iv_j'v_i = r_iv_i'v_j && [v_j'v_i = v_i'v_j]
\end{aligned}
\]

Since $v_i'Dv_j$ and $v_j'Dv_i$ are both 1 x 1, and since they are transposes of each other (recall that $D' = D$
because $D$ is symmetric), they must represent the same scalar. It follows that the extreme-right
expressions in these two equations are equal; hence, by subtracting, we have

\[
(r_j - r_i)\,v_i'v_j = 0
\]

Now if $r_i \neq r_j$ (distinct roots), then $v_i'v_j$ has to be zero in order for the equation to hold, and this
establishes our claim. If $r_i = r_j$ (repeated roots), moreover, it will always be possible, as it turns out, to
find two linearly independent normalized vectors satisfying $v_i'v_j = 0$. Thus, we may state in general
that $v_i'v_j = 0$, whenever $i \neq j$.

‡ As a simple illustration of this, think of the two unit vectors of a 2-space,
$e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$.
These vectors lie, respectively, on the two axes, and are thus perpendicular. At the same time, we do
find that $e_1'e_2 = e_2'e_1 = 0$. See also Exercise 4.3-4.

Now we are ready to explain how the characteristic roots and characteristic vectors of
matrix $D$ can be of service in determining the sign definiteness of the quadratic form $u'Du$.
In essence, the idea is again to transform $u'Du$ (which involves not only squared terms
$u_1^2, \ldots, u_n^2$, but also cross-product terms such as $u_1u_2$ and $u_2u_3$) into a form that contains
only squared terms. Thus the approach is similar in intent to the completing-the-square
process used before in deriving the determinantal test. However, in the present case, the
transformation possesses the additional feature that each squared term has as its coefficient
one of the characteristic roots, so that the signs of the $n$ roots will provide sufficient
information for determining the sign definiteness of the quadratic form.

The transformation that will do the trick is as follows. Let the characteristic vectors
$v_1, \ldots, v_n$ constitute the columns of a matrix $T$:

\[
\underset{(n\times n)}{T} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}
\]

and then apply the transformation $\underset{(n\times 1)}{u} = \underset{(n\times n)}{T}\;\underset{(n\times 1)}{y}$
to the quadratic form $u'Du$:

\[
\begin{aligned}
u'Du &= (Ty)'D(Ty) = y'T'DTy && [\text{by } (4.11)] \\
     &= y'Ry && \text{where } R = T'DT
\end{aligned}
\]

As a result, the original quadratic form in the variables $u_i$ is now turned into another
quadratic form in the variables $y_i$. Since the $u_i$ variables and the $y_i$ variables take the same
range of values, the transformation does not affect the sign definiteness of the quadratic
form. Thus we may now just as well consider the sign of the quadratic form $y'Ry$ instead.
What makes this latter quadratic form intriguing is that the matrix $R$ will turn out to be a
diagonal one, with the roots $r_1, \ldots, r_n$ of matrix $D$ displayed along its diagonal, and with
zeros everywhere else, so that we have in fact

\[
\begin{aligned}
u'Du = y'Ry &= \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}
\begin{bmatrix} r_1 & 0 & \cdots & 0 \\ 0 & r_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & r_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \\
&= r_1y_1^2 + r_2y_2^2 + \cdots + r_ny_n^2
\end{aligned}
\tag{11.16}
\]

which is an expression involving squared terms only. The transformation $R = T'DT$
provides us, therefore, with a procedure for diagonalizing the symmetric matrix $D$ into the
special diagonal matrix $R$.



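Example 6

Verify that the matrix $\begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}$ given in Example 5 can be
diagonalized into the matrix
$\begin{bmatrix} r_1 & 0 \\ 0 & r_2 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}$.
On the basis of the characteristic vectors found in Example 5, the
transformation matrix $T$ should be

\[
T = \begin{bmatrix} v_1 & v_2 \end{bmatrix}
  = \begin{bmatrix} 2/\sqrt{5} & -1/\sqrt{5} \\ 1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix}
\]

Thus we may write

\[
R = T'DT =
\begin{bmatrix} 2/\sqrt{5} & 1/\sqrt{5} \\ -1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix}
\begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}
\begin{bmatrix} 2/\sqrt{5} & -1/\sqrt{5} \\ 1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix}
=
\begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}
\]

which duly verifies the diagonalization process.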
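The product T'DT in Example 6 can also be checked numerically. The following sketch uses plain nested lists; the helper names `transpose` and `matmul` are illustrative assumptions, and the result is exact only up to floating-point rounding.

```python
import math

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

s = math.sqrt(5)
D = [[2, 2], [2, -1]]
T = [[2 / s, -1 / s],      # columns of T are the characteristic vectors v1 and v2
     [1 / s, 2 / s]]

R = matmul(matmul(transpose(T), D), T)
print(R)
```

Up to rounding, the off-diagonal entries of R vanish and the diagonal carries the two characteristic roots, in the same order as the columns of T.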


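To prove the diagonalization result in (11.16), let us (partially) write out the matrix $R$ as
follows:

\[
R = T'DT =
\begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix}
D \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}
\]

We may easily verify that $D\begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}$ can be rewritten as
$\begin{bmatrix} Dv_1 & Dv_2 & \cdots & Dv_n \end{bmatrix}$.
Besides, by (11.13), we can further rewrite this as
$\begin{bmatrix} r_1v_1 & r_2v_2 & \cdots & r_nv_n \end{bmatrix}$. Hence, we see
that

\[
\begin{aligned}
R &= \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix}
\begin{bmatrix} r_1v_1 & r_2v_2 & \cdots & r_nv_n \end{bmatrix}
= \begin{bmatrix}
r_1v_1'v_1 & r_2v_1'v_2 & \cdots & r_nv_1'v_n \\
r_1v_2'v_1 & r_2v_2'v_2 & \cdots & r_nv_2'v_n \\
\vdots & \vdots & & \vdots \\
r_1v_n'v_1 & r_2v_n'v_2 & \cdots & r_nv_n'v_n
\end{bmatrix} \\
&= \begin{bmatrix}
r_1 & 0 & \cdots & 0 \\
0 & r_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & r_n
\end{bmatrix}
\qquad [\text{by } (11.15)]
\end{aligned}
\]

which is precisely what we intended to show.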

In view of the result in (11.16), we may formally state the characteristic-root test for the
sign definiteness of a quadratic form as follows:

1. $q = u'Du$ is positive (negative) definite, if and only if every characteristic root of $D$ is
positive (negative).

2. $q = u'Du$ is positive (negative) semidefinite, if and only if all characteristic roots of $D$
are nonnegative (nonpositive).

3. $q = u'Du$ is indefinite, if and only if some of the characteristic roots of $D$ are positive
and some are negative.

Note that, in applying this test, all we need are the characteristic roots; the characteristic
vectors are not required unless we wish to find the transformation matrix $T$. Note, also, that
this test, unlike the determinantal test previously outlined, permits us to check the
second-order necessary conditions (part 2 of the test) simultaneously with the sufficient conditions
(part 1 of the test). However, it does have a drawback. When the matrix $D$ is of a high
dimension, the polynomial equation (11.14) may not be easily solvable for the characteristic
roots needed for the test. In such cases, the determinantal test might yet be preferable.
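For the 2 x 2 case the three parts of the characteristic-root test can be sketched directly, since the roots follow from the quadratic formula. The function names and the tolerance `tol` used to treat tiny floating-point values as zero are assumptions of this illustration, not part of the test itself.

```python
import math

def char_roots_sym2(a, h, b):
    """Characteristic roots of the symmetric matrix with rows (a, h) and (h, b)."""
    mean = (a + b) / 2
    radius = math.sqrt(((a - b) / 2) ** 2 + h ** 2)
    return mean + radius, mean - radius

def characteristic_root_test(a, h, b, tol=1e-12):
    """Classify q = a*u^2 + 2*h*u*v + b*v^2 by the signs of its characteristic roots."""
    roots = char_roots_sym2(a, h, b)
    if all(r > tol for r in roots):
        return "positive definite"
    if all(r < -tol for r in roots):
        return "negative definite"
    if all(r >= -tol for r in roots):
        return "positive semidefinite"
    if all(r <= tol for r in roots):
        return "negative semidefinite"
    return "indefinite"

print(characteristic_root_test(5, 1.5, 2))    # discriminant of Example 1 in the determinantal test
print(characteristic_root_test(2, 2, -1))     # matrix of Example 5
```

The definite cases are checked first, so a form reported as semidefinite here has at least one root that is zero within the tolerance.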
EXERCISE 11.3


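1. By direct matrix multiplication, express each of the following matrix products as a
quadratic form:

(a) $\begin{bmatrix} u & v \end{bmatrix}\begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix}$

(b) $\begin{bmatrix} u & v \end{bmatrix}\begin{bmatrix} -2 & 3 \\ 1 & -4 \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix}$

(c) $\begin{bmatrix} x & y \end{bmatrix}\begin{bmatrix} 5 & 2 \\ 4 & 0 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$

(d) $\begin{bmatrix} dx & dy \end{bmatrix}\begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}\begin{bmatrix} dx \\ dy \end{bmatrix}$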

2. In Prob. 1b and c, the coefficient matrices are not symmetric with respect to the
principal diagonal. Verify that by averaging the off-diagonal elements and thus converting
them, respectively, into $\begin{bmatrix} -2 & 2 \\ 2 & -4 \end{bmatrix}$ and
$\begin{bmatrix} 5 & 3 \\ 3 & 0 \end{bmatrix}$, we will get the same quadratic forms
as before.

3. On the basis of their coefficient matrices (the symmetric versions), determine by the
determinantal test whether the quadratic forms in Prob. 1a, b, and c are either positive
definite or negative definite.

4. Express each of the following quadratic forms as a matrix product involving a
symmetric coefficient matrix:

(a) $q = 3u^2 - 4uv + 7v^2$
(b) $q = u^2 - 7uv + 3v^2$
(c) $q = 8uv - u^2 - 31v^2$
(d) $q = 6xy - 5y^2 - 2x^2$
(e) $q = 3u_1^2 - 2u_1u_2 + 4u_1u_3 + 5u_2^2 + 4u_3^2 - 2u_2u_3$
(f) $q = -u^2 + 4uv - 6uw - 4v^2 - 7w^2$

5. From the discriminants obtained from the symmetric coefficient matrices of Prob. 4,
ascertain by the determinantal test which of the quadratic forms are positive definite
and which are negative definite.

6. Find the characteristic roots of each of the following matrices:

(a) $D = \begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}$   (b) $E = \begin{bmatrix} -2 & 2 \\ 2 & -4 \end{bmatrix}$   (c) $F = \begin{bmatrix} 5 & 3 \\ 3 & 0 \end{bmatrix}$

What can you conclude about the signs of the quadratic forms $u'Du$, $u'Eu$, and $u'Fu$?
(Check your results against Prob. 3.)

7. Find the characteristic vectors of the matrix $\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}$.

8. Given a quadratic form $u'Du$, where $D$ is 2 x 2, the characteristic equation of $D$ can be
written as

\[
\begin{vmatrix} d_{11} - r & d_{12} \\ d_{21} & d_{22} - r \end{vmatrix} = 0 \qquad (d_{12} = d_{21})
\]

Expand the determinant; express the roots of this equation by use of the quadratic
formula; and deduce the following:

(a) No imaginary number (a number involving $\sqrt{-1}$) can occur in $r_1$ and $r_2$.

(b) To have repeated roots, matrix $D$ must be in the form of $\begin{bmatrix} c & 0 \\ 0 & c \end{bmatrix}$.

(c) To have either positive or negative semidefiniteness, the discriminant of the
quadratic form may vanish, that is, $|D| = 0$ is possible.

11.4 Objective Functions with More than Two Variables

When there appear in an objective function $n > 2$ choice variables, it is no longer possible to
graph the function, although we can still speak of a hypersurface in an $(n + 1)$-dimensional
space. On such a (nongraphable) hypersurface, there again may exist $(n + 1)$-dimensional
analogs of peaks of domes and bottoms of bowls. How do we identify them?

First-Order Condition for Extremum

Let us specifically consider a function of three choice variables,

\[
z = f(x_1, x_2, x_3)
\]

with first partial derivatives $f_1$, $f_2$, and $f_3$ and second partial derivatives
$f_{ij}$ $(= \partial^2 z/\partial x_i\,\partial x_j)$, with $i, j = 1, 2, 3$. By virtue of Young's theorem, we have $f_{ij} = f_{ji}$.

Our earlier discussion suggests that, to have a maximum or a minimum of $z$, it is
necessary that $dz = 0$ for arbitrary values of $dx_1$, $dx_2$, and $dx_3$, not all zero. Since the value of
$dz$ is now

\[
dz = f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3 \tag{11.17}
\]

and since $dx_1$, $dx_2$, and $dx_3$ are arbitrary changes in the independent variables, not all zero,
the only way to guarantee a zero $dz$ is to have $f_1 = f_2 = f_3 = 0$. Thus, again, the necessary
condition for extremum is that all the first-order partial derivatives be zero, the same as for
the two-variable case.†

Second-Order Condition

The satisfaction of the first-order condition earmarks certain values of $z$ as the stationary
values of the objective function. If at a stationary value of $z$ we find that $d^2z$ is positive
definite, this will suffice to establish that value of $z$ as a minimum. Analogously, the negative
definiteness of $d^2z$ is a sufficient condition for the stationary value to be a maximum. This
raises the questions of how to express $d^2z$ when there are three variables in the function and
how to determine its positive or negative definiteness.

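† As a special case, note that if we happen to be working with a function $z = f(x_1, x_2, x_3)$ implicitly
defined by an equation $F(z, x_1, x_2, x_3) = 0$, where

\[
f_i = \frac{\partial z}{\partial x_i} = -\frac{\partial F/\partial x_i}{\partial F/\partial z} \qquad (i = 1, 2, 3)
\]

then the first-order condition $f_1 = f_2 = f_3 = 0$ will amount to the condition

\[
\frac{\partial F}{\partial x_1} = \frac{\partial F}{\partial x_2} = \frac{\partial F}{\partial x_3} = 0
\]

since the value of the denominator $\partial F/\partial z \neq 0$ makes no difference.

The expression for $d^2z$ can be obtained by differentiating $dz$ in (11.17). In such a
process, as in (11.6), we should treat the derivatives $f_i$ as variables and the differentials $dx_i$
as constants. Thus we have

\[
\begin{aligned}
d^2z = d(dz) &= \frac{\partial(dz)}{\partial x_1}\,dx_1 + \frac{\partial(dz)}{\partial x_2}\,dx_2 + \frac{\partial(dz)}{\partial x_3}\,dx_3 \\
&= \frac{\partial}{\partial x_1}(f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3)\,dx_1 \\
&\quad + \frac{\partial}{\partial x_2}(f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3)\,dx_2 \\
&\quad + \frac{\partial}{\partial x_3}(f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3)\,dx_3 \\
&= f_{11}\,dx_1^2 + f_{12}\,dx_1\,dx_2 + f_{13}\,dx_1\,dx_3 \\
&\quad + f_{21}\,dx_2\,dx_1 + f_{22}\,dx_2^2 + f_{23}\,dx_2\,dx_3 \\
&\quad + f_{31}\,dx_3\,dx_1 + f_{32}\,dx_3\,dx_2 + f_{33}\,dx_3^2
\end{aligned}
\tag{11.18}
\]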


which is a quadratic form similar to (11.12). Consequently, the criteria for positive and
negative definiteness we learned earlier are directly applicable here.

In determining the positive or negative definiteness of $d^2z$, we must again, as we did in
(11.6'), regard $dx_i$ as variables that can take any values (though not all zero), while
considering the derivatives $f_{ij}$ as coefficients upon which to impose certain restrictions. The
coefficients in (11.18) give rise to the symmetric Hessian determinant

\[
|H| = \begin{vmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{vmatrix}
\]

whose leading principal minors may be denoted by

\[
|H_1| = f_{11} \qquad
|H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} \qquad
|H_3| = |H|
\]

Thus, on the basis of the determinantal criteria for positive and negative definiteness, we
may state the second-order sufficient condition for an extremum of $z$ as follows:

\[
z^* \text{ is a }
\begin{cases} \text{maximum} \\ \text{minimum} \end{cases}
\text{ if }
\begin{cases}
|H_1| < 0;\ |H_2| > 0;\ |H_3| < 0 & (d^2z \text{ negative definite}) \\
|H_1| > 0;\ |H_2| > 0;\ |H_3| > 0 & (d^2z \text{ positive definite})
\end{cases}
\tag{11.19}
\]

In using this condition, we must evaluate all the leading principal minors at the stationary
point where $f_1 = f_2 = f_3 = 0$.

We may, of course, also apply the characteristic-root test and associate the positive
definiteness (negative definiteness) of $d^2z$ with the positivity (negativity) of all the characteristic
roots of the Hessian matrix
$\begin{bmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{bmatrix}$.
In fact, instead of saying that the second-order total differential $d^2z$ is positive (negative)
definite, it is also acceptable to state that
the Hessian matrix $H$ (to be distinguished from the Hessian determinant $|H|$) is positive
(negative) definite. In this usage, however, note that the sign definiteness of $H$ refers to the
sign of the quadratic form $d^2z$ with which $H$ is associated, not to the signs of the elements
of $H$ per se.

Find the extreme value(s) of 

$$z = 2x_1^2 + x_1x_2 + 4x_2^2 + x_1x_3 + x_3^2 + 2$$

The first-order condition for extremum involves the simultaneous satisfaction of the following three equations:

$$\begin{aligned} (f_1 =)\ 4x_1 + x_2 + x_3 &= 0 \\ (f_2 =)\ x_1 + 8x_2 &= 0 \\ (f_3 =)\ x_1 + 2x_3 &= 0 \end{aligned}$$

Because this is a homogeneous linear-equation system, in which all three equations are independent (the determinant of the coefficient matrix does not vanish), there exists only the single solution $x_1^* = x_2^* = x_3^* = 0$. This means that there is only one stationary value, $z^* = 2$.

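The Hessian determinant of this function is

$$|H| = \begin{vmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{vmatrix} = \begin{vmatrix} 4 & 1 & 1 \\ 1 & 8 & 0 \\ 1 & 0 & 2 \end{vmatrix}$$

whose leading principal minors are all positive:

$$|H_1| = 4 \qquad |H_2| = 31 \qquad |H_3| = 54$$

Thus we can conclude, by (11.19), that $z^* = 2$ is a minimum.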
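The three minors of Example 1 can be cross-checked mechanically. The sketch below is only an illustration: the helper names `det` and `leading_principal_minors` are not from the text, and the determinant is computed by plain cofactor expansion, which is adequate for matrices this small.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def leading_principal_minors(h):
    """Return [|H_1|, |H_2|, ..., |H_n|] for a square matrix h."""
    return [det([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]


# Hessian of z = 2*x1**2 + x1*x2 + 4*x2**2 + x1*x3 + x3**2 + 2 (constant in x)
H = [[4, 1, 1],
     [1, 8, 0],
     [1, 0, 2]]

minors = leading_principal_minors(H)
print(minors)                      # [4, 31, 54]
print(all(m > 0 for m in minors))  # True: d^2 z is positive definite
```

Because every entry of this Hessian is a constant, the minors are the same at any point, not merely at the stationary point.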


Example 2

Find the extreme value(s) of

$$z = -x_1^3 + 3x_1x_3 + 2x_2 - x_2^2 - 3x_3^2$$

The first partial derivatives are found to be

$$f_1 = -3x_1^2 + 3x_3 \qquad f_2 = 2 - 2x_2 \qquad f_3 = 3x_1 - 6x_3$$

By setting all $f_i$ equal to zero, we get three simultaneous equations, one nonlinear and two linear:

$$\begin{aligned} -3x_1^2 + 3x_3 &= 0 \\ -2x_2 &= -2 \\ 3x_1 - 6x_3 &= 0 \end{aligned}$$

Since the second equation gives $x_2^* = 1$ and the third equation implies $x_1^* = 2x_3^*$, substitution of these into the first equation yields two solutions:

$$(x_1^*, x_2^*, x_3^*) = \begin{cases} (0, 1, 0), & \text{implying } z^* = 1 \\ (\tfrac{1}{2}, 1, \tfrac{1}{4}), & \text{implying } z^* = \tfrac{17}{16} \end{cases}$$

The second-order partial derivatives, properly arranged, give us the Hessian 


$$|H| = \begin{vmatrix} -6x_1 & 0 & 3 \\ 0 & -2 & 0 \\ 3 & 0 & -6 \end{vmatrix}$$

in which the first element ($-6x_1$) reduces to 0 under the first solution (with $x_1^* = 0$) and to $-3$ under the second (with $x_1^* = \tfrac{1}{2}$). It is immediately obvious that the first solution does not satisfy the second-order sufficient condition, since $|H_1| = 0$. We may, however, resort to the characteristic-root test for further information. For this purpose, we apply the characteristic equation (11.14). Since the quadratic form being tested is $d^2z$, whose discriminant is the Hessian determinant, we should, of course, substitute the elements of the Hessian for the $d_{ij}$ elements in that equation. Hence the characteristic equation is (for the first solution)

$$\begin{vmatrix} -r & 0 & 3 \\ 0 & -2 - r & 0 \\ 3 & 0 & -6 - r \end{vmatrix} = 0$$

which, upon expansion, becomes the cubic equation

$$r^3 + 8r^2 + 3r - 18 = 0$$

Using Theorem I in Sec. 3.3, we are able to find an integer root $-2$. Thus the cubic function should be divisible by $(r + 2)$, and we can factor the cubic function and rewrite the preceding equation as

$$(r + 2)(r^2 + 6r - 9) = 0$$

It is clear from the $(r + 2)$ term that one of the characteristic roots is $r_1 = -2$. The other two roots can be found by applying the quadratic formula to the other term; they are $r_2 = -3 + \tfrac{1}{2}\sqrt{72}$ and $r_3 = -3 - \tfrac{1}{2}\sqrt{72}$. Inasmuch as $r_1$ and $r_3$ are negative but $r_2$ is positive, the quadratic form $d^2z$ is indefinite, thereby violating the second-order necessary conditions for both a maximum and a minimum of $z$. Thus the first solution ($z^* = 1$) is not an extremum at all.
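The factorization and the mixed signs of the roots can be confirmed numerically. This is a small sketch using floating-point arithmetic, so the residuals are compared against a tolerance rather than against exact zero; the names `char_poly` and `roots` are chosen here for illustration.

```python
import math


def char_poly(r):
    """Characteristic polynomial r^3 + 8r^2 + 3r - 18 of the Hessian at (0, 1, 0)."""
    return r**3 + 8 * r**2 + 3 * r - 18


# Roots read off the factorization (r + 2)(r^2 + 6r - 9) = 0
disc = math.sqrt(6**2 - 4 * 1 * (-9))          # sqrt(72)
roots = [-2.0, (-6 + disc) / 2, (-6 - disc) / 2]

for r in roots:
    # Each root should make the cubic vanish up to rounding error.
    print(round(r, 4), abs(char_poly(r)) < 1e-9)

has_positive = any(r > 0 for r in roots)
has_negative = any(r < 0 for r in roots)
print("indefinite" if has_positive and has_negative else "not indefinite")
```

One positive root alongside two negative ones is exactly what makes $d^2z$ indefinite at the first stationary point.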

As for the second solution, the situation is simpler. Since the leading principal minors

$$|H_1| = -3 \qquad |H_2| = 6 \qquad \text{and} \qquad |H_3| = -18$$

duly alternate in sign, the determinantal test is conclusive. According to (11.19), the solution $z^* = \tfrac{17}{16}$ is a maximum.
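Both stationary points of Example 2 can be verified in exact arithmetic. The sketch below uses Python's `fractions.Fraction` so that no rounding enters; the function names are illustrative, and `leading_minors` hard-codes the $3 \times 3$ Hessian of this particular example rather than handling a general matrix.

```python
from fractions import Fraction as F


def z(x1, x2, x3):
    return -x1**3 + 3 * x1 * x3 + 2 * x2 - x2**2 - 3 * x3**2


def gradient(x1, x2, x3):
    """First partial derivatives (f_1, f_2, f_3)."""
    return (-3 * x1**2 + 3 * x3, 2 - 2 * x2, 3 * x1 - 6 * x3)


def leading_minors(x1):
    """Leading principal minors of H = [[-6*x1, 0, 3], [0, -2, 0], [3, 0, -6]]."""
    h1 = -6 * x1
    h2 = h1 * (-2)                      # upper-left 2x2 block is diagonal
    h3 = -2 * (h1 * (-6) - 3 * 3)       # expansion along the second row
    return h1, h2, h3


for point in [(F(0), F(1), F(0)), (F(1, 2), F(1), F(1, 4))]:
    print(gradient(*point) == (0, 0, 0), z(*point), leading_minors(point[0]))
```

At $(\tfrac{1}{2}, 1, \tfrac{1}{4})$ the minors come out as $-3$, $6$, $-18$, while at $(0, 1, 0)$ the first minor is zero, which is why the determinantal test is silent there.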

n-Variable Case 

When there are n choice variables, the objective function may be expressed as 

$$z = f(x_1, x_2, \ldots, x_n)$$

The total differential will then be

$$dz = f_1\,dx_1 + f_2\,dx_2 + \cdots + f_n\,dx_n$$

so that the necessary condition for extremum ($dz = 0$ for arbitrary $dx_i$, not all zero) means that all the $n$ first-order partial derivatives are required to be zero.

The second-order differential $d^2z$ will again be a quadratic form, derivable analogously to (11.18) and expressible by an $n \times n$ array. The coefficients of that array, properly arranged, will now give the (symmetric) Hessian

$$|H| = \begin{vmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{vmatrix}$$

with leading principal minors $|H_1|, |H_2|, \ldots, |H_n|$, as defined before. The second-order sufficient condition for extremum is, as before, that all those principal minors be positive (for a minimum in $z$) and that they duly alternate in sign (for a maximum in $z$), the first one being negative.

TABLE 11.2  Determinantal Test for Relative Extremum: $z = f(x_1, x_2, \ldots, x_n)$

Condition                             Maximum                                                            Minimum
First-order necessary condition       $f_1 = f_2 = \cdots = f_n = 0$                                     $f_1 = f_2 = \cdots = f_n = 0$
Second-order sufficient condition†    $|H_1| < 0;\ |H_2| > 0;\ |H_3| < 0;\ \ldots;\ (-1)^n |H_n| > 0$    $|H_1|, |H_2|, \ldots, |H_n| > 0$

† Applicable only after the first-order necessary condition has been satisfied.

In summary, then, if we concentrate on the determinantal test, we have the criteria as listed in Table 11.2, which is valid for an objective function of any number of choice variables. As special cases, we can have $n = 1$ or $n = 2$. When $n = 1$, the objective function is $z = f(x)$, and the conditions for maximization, $f_1 = 0$ and $|H_1| < 0$, reduce to $f'(x) = 0$ and $f''(x) < 0$, exactly as we learned in Sec. 9.4. Similarly, when $n = 2$, the objective function is $z = f(x_1, x_2)$, so that the first-order condition for maximum is $f_1 = f_2 = 0$, whereas the second-order sufficient condition becomes

$$f_{11} < 0 \qquad \text{and} \qquad \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} = f_{11}f_{22} - f_{12}^2 > 0$$

which is merely a restatement of the information presented in Table 11.1.
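The sign patterns of the determinantal test translate directly into a short decision rule. The sketch below is an illustration, not part of the text: `classify_by_minors` is a made-up name, it expects a nonempty list of leading principal minors already evaluated at a stationary point, and it reports "inconclusive" whenever the sufficient condition fails, since failure of a sufficient condition settles nothing.

```python
def classify_by_minors(minors):
    """Apply the determinantal test to [|H_1|, ..., |H_n|] at a stationary point.

    Returns "minimum", "maximum", or "inconclusive" (sufficient condition not met).
    """
    if all(m > 0 for m in minors):
        return "minimum"                      # d^2 z positive definite
    # Alternating signs, first one negative: (-1)^k |H_k| > 0 for k = 1, ..., n
    if all((-1) ** k * m > 0 for k, m in enumerate(minors, start=1)):
        return "maximum"                      # d^2 z negative definite
    return "inconclusive"


print(classify_by_minors([4, 31, 54]))     # minimum
print(classify_by_minors([-3, 6, -18]))    # maximum
print(classify_by_minors([0, 0, 18]))      # inconclusive
print(classify_by_minors([-2]))            # maximum (n = 1: f''(x) < 0)
```

The single-element call mirrors the $n = 1$ special case, where the only minor is $f''(x)$ itself.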


EXERCISE 11.4 

Find the extreme values, if any, of the following four functions. Check whether they are 
maxima or minima by the determinantal test. 

1. $z = x_1^2 + 3x_2^2 - 3x_1x_2 + 4x_2x_3 + 6x_3^2$

2. $z = 29 - (x_1^2 + x_2^2 + x_3^2)$

3. $z = x_1x_3 + x_1^2 - x_2 + x_2x_3 + x_2^2 + 3x_3^2$

4. $z = e^{2x} + e^{-y} + e^{w^2} - (2x + 2e^w - y)$

Then answer the following questions regarding Hessian matrices and their characteristic 
roots. 

5. (a) Which of Probs. 1 through 4 yield diagonal Hessian matrices? In each such case, do the diagonal elements possess a uniform sign?

(b) What can you conclude about the characteristic roots of each diagonal Hessian matrix found? About the sign definiteness of $d^2z$?

(c) Do the results of the characteristic-root test check with those of the determinantal test?

6. (a) Find the characteristic roots of the Hessian matrix for Prob. 3.

(b) What can you conclude from your results?

(c) Is your answer to (b) consistent with the result of the determinantal test for Prob. 3?

11.5 Second-Order Conditions in Relation to Concavity and Convexity

Second-order conditions, whether stated in terms of the principal minors of the Hessian determinant or the characteristic roots of the Hessian matrix, are always concerned with the question of whether a stationary point is the peak of a hill or the bottom of a valley. In other words, they relate to how a curve, surface, or hypersurface (as the case may be) bends itself around a stationary point. In the single-choice-variable case, with $z = f(x)$, the hill (valley) configuration is manifest in an inverse U-shaped (U-shaped) curve. For the two-variable function $z = f(x, y)$, the hill (valley) configuration takes the form of a dome-shaped (bowl-shaped) surface, as illustrated in Fig. 11.2a (Fig. 11.2b). When three or more choice variables are present, the hills and valleys are no longer graphable, but we may nevertheless think of "hills" and "valleys" on hypersurfaces.

A function that gives rise to a hill (valley) over the entire domain is said to be a concave (convex) function.† For the present discussion, we shall take the domain to be the entire $R^n$, where $n$ is the number of choice variables. Inasmuch as the hill and valley characterizations refer to the entire domain, concavity and convexity are, of course, global concepts. For a finer classification, we may also distinguish between concavity and convexity on the one hand, and strict concavity and strict convexity on the other hand. In the nonstrict case, the hill or valley is allowed to contain one or more flat (as against curved) portions, such as line segments (on a curve) or plane segments (on a surface). The presence of the word strict, however, rules out such line or plane segments. The two surfaces shown in Fig. 11.2 illustrate strictly concave and strictly convex functions, respectively. The curve in Fig. 6.5, on the other hand, is convex (it shows a valley) but not strictly convex (it contains line segments). A strictly concave (strictly convex) function must be concave (convex), but the converse is not true.

In view of the association of concavity and strict concavity with a global hill configuration, an extremum of a concave function must be a peak, that is, a maximum (as against a minimum). Moreover, that maximum must be an absolute maximum (as against a relative maximum), since the hill covers the entire domain. However, that absolute maximum may not be unique, because multiple maxima may occur if the hill contains a flat horizontal top. The latter possibility can be dismissed only when we specify strict concavity. For only then will the peak consist of a single point and the absolute maximum be unique. A unique (nonunique) absolute maximum is also referred to as a strong (weak) absolute maximum.

By analogous reasoning, an extremum of a convex function must be an absolute (or global) minimum, which may not be unique. But an extremum of a strictly convex function must be a unique absolute minimum.

In the preceding paragraphs, the properties of concavity and convexity are taken to be global in scope. If they are valid only for a portion of the curve or surface (only on a subset $S$ of the domain), then the associated maximum and minimum are relative (or local) to that subset of the domain, since we cannot be certain of the situation outside of subset $S$. In our earlier discussion of the sign definiteness of $d^2z$ (or of the Hessian matrix $H$), we evaluated the leading principal minors of the Hessian determinant only at the stationary point. By thus limiting the verification of the hill or valley configuration to a small neighborhood of the stationary point, we could discuss only relative maxima and minima. But it may happen that $d^2z$ has a definite sign everywhere, regardless of where the leading principal minors are evaluated. In that event, the hill or valley would cover the entire domain, and the maximum or minimum found would be absolute in nature. More specifically, if $d^2z$ is everywhere negative (positive) semidefinite, the function $z = f(x_1, \ldots, x_n)$ must be concave (convex), and if $d^2z$ is everywhere negative (positive) definite, the function $f$ must be strictly concave (strictly convex).

† If the hill (valley) pertains only to a subset $S$ of the domain, the function is said to be concave (convex) on $S$.

FIGURE 11.5

The preceding discussion is summarized in Fig. 11.5 for a twice continuously differentiable function $z = f(x_1, x_2, \ldots, x_n)$. For clarity, we concentrate exclusively on concavity and maximum; however, the relationships depicted will remain valid if the words concave, negative, and maximum are replaced, respectively, by convex, positive, and minimum. To read Fig. 11.5, recall that the $\Rightarrow$ symbol (here elongated and even bent) means "implies." When that symbol extends from one enclosure (say, a rectangle) to another (say, an oval), it means that the former implies (is sufficient for) the latter; it also means that the latter is necessary for the former. And when the $\Rightarrow$ symbol extends from one enclosure through a second to a third, it means that the first enclosure, when accompanied by the second, implies the third.

In this light, the middle column in Fig. 11.5, read from top to bottom, states that the first-order condition is necessary for $z^*$ to be a relative maximum, and the relative-maximum status of $z^*$ is, in turn, necessary for $z^*$ to be an absolute maximum, and so on. Alternatively, reading that column from bottom to top, we see that the fact that $z^*$ is a unique absolute maximum is sufficient to establish $z^*$ as an absolute maximum, and the absolute-maximum status of $z^*$ is, in turn, sufficient for $z^*$ to be a relative maximum, and so forth. The three ovals at the top have to do with the first- and second-order conditions at the stationary point $z^*$. Hence they relate only to a relative maximum. The diamonds and triangles in the lower part, on the other hand, describe global properties that enable us to draw conclusions about an absolute maximum. Note that while our earlier discussion indicated only that the everywhere negative semidefiniteness of $d^2z$ is sufficient for the concavity of function $f$, we have added in Fig. 11.5 the information that the condition is necessary, too. In contrast, the stronger property of everywhere negative definiteness of $d^2z$ is sufficient, but not necessary, for the strict concavity of $f$, because strict concavity of $f$ is compatible with a zero value of $d^2z$ at a stationary point.

The most important message conveyed by Fig. 11.5, however, lies in the two extended $\Rightarrow$ symbols passing through the two diamonds. The one on the left states that, given a concave objective function, any stationary point can immediately be identified as an absolute maximum. Proceeding a step further, we see that the one on the right indicates that if the objective function is strictly concave, the stationary point must in fact be a unique absolute maximum. In either case, once the first-order condition is met, concavity or strict concavity effectively replaces the second-order condition as a sufficient condition for a maximum; indeed, for an absolute maximum. The power of this new sufficient condition becomes clear when we recall that $d^2z$ can happen to be zero at a peak, causing the second-order sufficient condition to fail. Concavity or strict concavity, however, can take care of even such troublesome peaks, because it guarantees that a higher-order sufficient condition is satisfied even if the second-order one is not. It is for this reason that economists often assume concavity from the very outset when a maximization model is to be formulated with a general objective function (and, similarly, convexity is often assumed for a minimization model). For then all one needs to do is to apply the first-order condition. Note, however, that if a specific objective function is used, the property of concavity or convexity can no longer simply be assumed. Rather, it must be checked.

Checking Concavity and Convexity 

Concavity and convexity, strict or nonstrict, can be defined (and checked) in several ways. We shall first introduce a geometric definition of concavity and convexity for a two-variable function $z = f(x_1, x_2)$, similar to the one-variable version discussed in Sec. 9.3:

The function $z = f(x_1, x_2)$ is concave (convex) iff, for any pair of distinct points $M$ and $N$ on its graph (a surface), line segment $MN$ lies either on or below (above) the surface. The function is strictly concave (strictly convex) iff line segment $MN$ lies entirely below (above) the surface, except at $M$ and $N$.

FIGURE 11.6

The case of a strictly concave function is illustrated in Fig. 11.6, where $M$ and $N$, two arbitrary points on the surface, are joined together by a broken line segment as well as a solid arc, with the latter consisting of points on the surface that lie directly above the line segment. Since strict concavity requires line segment $MN$ to lie entirely below arc $MN$ (except at $M$ and $N$) for any pair of points $M$ and $N$, the surface must typically be dome-shaped. Analogously, the surface of a strictly convex function must typically be bowl-shaped. As for (nonstrictly) concave and convex functions, since line segment $MN$ is allowed to lie on the surface itself, some portion of the surface, or even the entire surface, may be a plane, that is, flat rather than curved.

To facilitate generalization to the nongraphable $n$-dimensional case, the geometric definition needs to be translated into an equivalent algebraic version. Returning to Fig. 11.6, let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ be any two distinct ordered pairs (2-vectors) in the domain of $z = f(x_1, x_2)$. Then the $z$ values (height of surface) corresponding to these will be $f(u) = f(u_1, u_2)$ and $f(v) = f(v_1, v_2)$, respectively. We have assumed that the variables can take all real values, so if $u$ and $v$ are in the domain, then all the points on the line segment $uv$ are also in the domain. Now each point on the said line segment is in the nature of a "weighted average" of $u$ and $v$. Thus we can denote this line segment by $\theta u + (1 - \theta)v$, where $\theta$ (the Greek letter theta), unlike $u$ and $v$, is a (variable) scalar with the range of values $0 \le \theta \le 1$.† By the same token, line segment $MN$, representing the set of all weighted averages of $f(u)$ and $f(v)$, can be expressed by $\theta f(u) + (1 - \theta) f(v)$, with $\theta$ again varying from 0 to 1. What about arc $MN$ along the surface? Since that arc shows the values of the function $f$ evaluated at the various points on line segment $uv$, it can be written simply as $f[\theta u + (1 - \theta)v]$. Using these expressions, we may now state the following algebraic definition:

† The weighted-average expression $\theta u + (1 - \theta)v$, for any specific value of $\theta$ between 0 and 1, is technically known as a convex combination of the two vectors $u$ and $v$. Leaving a more detailed explanation of this to a later point of this section, we may note here that when $\theta = 0$, the given expression reduces to vector $v$, and similarly that when $\theta = 1$, the expression reduces to vector $u$. An intermediate value of $\theta$, on the other hand, gives us an average of the two vectors $u$ and $v$.


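A function $f$ is $\begin{Bmatrix} \text{concave} \\ \text{convex} \end{Bmatrix}$ iff, for any pair of distinct points $u$ and $v$ in the domain of $f$, and for $0 < \theta < 1$,

$$\underbrace{\theta f(u) + (1 - \theta) f(v)}_{\text{height of line segment}} \begin{Bmatrix} \le \\ \ge \end{Bmatrix} \underbrace{f[\theta u + (1 - \theta)v]}_{\text{height of arc}} \tag{11.20}$$

Note that, in order to exclude the two end points $M$ and $N$ from the height comparison, we have restricted $\theta$ to the open interval $(0, 1)$ only.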

This definition is easily adaptable to strict concavity and convexity by changing the weak inequalities $\le$ and $\ge$ to the strict inequalities $<$ and $>$, respectively. The advantage of the algebraic definition is that it can be applied to a function of any number of variables, for the vectors $u$ and $v$ in the definition can very well be interpreted as $n$-vectors instead of 2-vectors.
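Because (11.20) involves only function values, it lends itself to a numerical spot check. The sketch below is an illustration under stated assumptions: the helper names, the sampling box $[-5, 5]$, and the two test functions are choices made here, and random sampling can only fail to find a violation; it cannot prove concavity or convexity.

```python
import random


def chord_minus_arc(f, u, v, theta):
    """theta*f(u) + (1-theta)*f(v) - f(theta*u + (1-theta)*v): left minus right side of (11.20)."""
    w = [theta * a + (1 - theta) * b for a, b in zip(u, v)]
    return theta * f(u) + (1 - theta) * f(v) - f(w)


def sample_gaps(f, n_vars, trials=1000, seed=0):
    """Evaluate the chord-minus-arc gap on randomly drawn point pairs and weights."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        u = [rng.uniform(-5, 5) for _ in range(n_vars)]
        v = [rng.uniform(-5, 5) for _ in range(n_vars)]
        theta = rng.uniform(0.01, 0.99)      # stay inside the open interval (0, 1)
        gaps.append(chord_minus_arc(f, u, v, theta))
    return gaps


def bowl(x):
    return x[0] ** 2 + x[1] ** 2


def dome(x):
    return -(x[0] ** 2) - x[1] ** 2


TOL = 1e-9  # allowance for floating-point rounding
print(min(sample_gaps(bowl, 2)) >= -TOL)   # no sampled violation of the convex inequality
print(max(sample_gaps(dome, 2)) <= TOL)    # no sampled violation of the concave inequality
```

The same two helpers work unchanged for any number of variables, which is the practical payoff of the algebraic definition.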

From (11.20), the following three theorems on concavity and convexity can be deduced fairly easily. These will be stated in terms of functions $f(x)$ and $g(x)$, but $x$ can be interpreted as a vector of variables; that is, the theorems are valid for functions of any number of variables.

Theorem I (linear function) If $f(x)$ is a linear function, then it is a concave function as well as a convex function, but not strictly so.

Theorem II (negative of a function) If $f(x)$ is a concave function, then $-f(x)$ is a convex function, and vice versa. Similarly, if $f(x)$ is a strictly concave function, then $-f(x)$ is a strictly convex function, and vice versa.

Theorem III (sum of functions) If $f(x)$ and $g(x)$ are both concave (convex) functions, then $f(x) + g(x)$ is also a concave (convex) function. If $f(x)$ and $g(x)$ are both concave (convex) and, in addition, either one or both of them are strictly concave (strictly convex), then $f(x) + g(x)$ is strictly concave (strictly convex).

Theorem I follows from the fact that a linear function plots as a straight line, plane, or hyperplane, so that "line segment $MN$" always coincides with "arc $MN$." Consequently, the equality part of the two weak inequalities in (11.20) is simultaneously satisfied, making the function qualify as both concave and convex. However, since it fails the strict-inequality part of the definition, the linear function is neither strictly concave nor strictly convex.

Underlying Theorem II is the fact that the definitions of concavity and convexity differ only in the sense of inequality. Suppose that $f(x)$ is concave; then

$$\theta f(u) + (1 - \theta) f(v) \le f[\theta u + (1 - \theta)v]$$

Multiplying through by $-1$, and duly reversing the sense of the inequality, we get

$$\theta[-f(u)] + (1 - \theta)[-f(v)] \ge -f[\theta u + (1 - \theta)v]$$

This, however, is precisely the condition for $-f(x)$ to be convex. Thus the theorem is proved for the concave $f(x)$ case. The geometric interpretation of this result is very simple: the mirror image of a hill with reference to the base plane or hyperplane is a valley. The opposite case can be proved similarly.

To see the reason behind Theorem III, suppose that $f(x)$ and $g(x)$ are both concave. Then the following two inequalities hold:

$$\theta f(u) + (1 - \theta) f(v) \le f[\theta u + (1 - \theta)v] \tag{11.21}$$

$$\theta g(u) + (1 - \theta) g(v) \le g[\theta u + (1 - \theta)v] \tag{11.22}$$

Adding these, we obtain a new inequality

$$\theta[f(u) + g(u)] + (1 - \theta)[f(v) + g(v)] \le f[\theta u + (1 - \theta)v] + g[\theta u + (1 - \theta)v] \tag{11.23}$$

But this is precisely the condition for $[f(x) + g(x)]$ to be concave. Thus the theorem is proved for the concave case. The proof for the convex case is similar.

Moving to the second part of Theorem III, let $f(x)$ be strictly concave. Then (11.21) becomes a strict inequality:

$$\theta f(u) + (1 - \theta) f(v) < f[\theta u + (1 - \theta)v] \tag{11.21'}$$

Adding this to (11.22), we find the sum of the left-side expressions in these two inequalities to be strictly less than the sum of the right-side expressions, regardless of whether the $<$ sign or the $=$ sign holds in (11.22). This means that (11.23) now becomes a strict inequality, too, thereby making $[f(x) + g(x)]$ strictly concave. Besides, the same conclusion emerges a fortiori if $g(x)$ is made strictly concave along with $f(x)$, that is, if (11.22) is converted into a strict inequality along with (11.21). This proves the second part of the theorem for the concave case. The proof for the convex case is similar.

This theorem, which is also valid for a sum of more than two concave (convex) functions, may prove useful sometimes because it makes possible the compartmentalization of the task of checking concavity or convexity of a function that consists of additive terms. If the additive terms are found to be individually concave (convex), that would be sufficient for the sum function to be concave (convex).


Example 1

Check $z = x_1^2 + x_2^2$ for concavity or convexity. To apply (11.20), let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ be any two distinct points in the domain. Then we have

$$f(u) = f(u_1, u_2) = u_1^2 + u_2^2$$

$$f(v) = f(v_1, v_2) = v_1^2 + v_2^2$$

and

$$f[\theta u + (1 - \theta)v] = f\big(\underbrace{\theta u_1 + (1 - \theta)v_1}_{\text{value of } x_1},\ \underbrace{\theta u_2 + (1 - \theta)v_2}_{\text{value of } x_2}\big) = [\theta u_1 + (1 - \theta)v_1]^2 + [\theta u_2 + (1 - \theta)v_2]^2$$

Substituting these into (11.20), subtracting the right-side expression from the left-side one, and collecting terms, we find their difference to be

$$\theta(1 - \theta)(u_1^2 + u_2^2) + \theta(1 - \theta)(v_1^2 + v_2^2) - 2\theta(1 - \theta)(u_1v_1 + u_2v_2) = \theta(1 - \theta)[(u_1 - v_1)^2 + (u_2 - v_2)^2]$$

Since $\theta$ is a positive fraction, $\theta(1 - \theta)$ must be positive. Moreover, since $(u_1, u_2)$ and $(v_1, v_2)$ are distinct points, so that either $u_1 \ne v_1$ or $u_2 \ne v_2$ (or both), the bracketed expression must also be positive. Thus the strict $>$ inequality holds in (11.20), and $z = x_1^2 + x_2^2$ is strictly convex.

Alternatively, we may check the $x_1^2$ and $x_2^2$ terms separately. Since each of them is individually strictly convex, their sum is also strictly convex.

Because this function is strictly convex, it possesses a unique absolute minimum. It is easy to verify that the said minimum is $z^* = 0$, attained at $x_1^* = x_2^* = 0$, and that it is indeed absolute and unique because any ordered pair $(x_1, x_2) \ne (0, 0)$ yields a $z$ value greater than zero.
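The algebra of Example 1 can be double-checked in exact rational arithmetic. This is a sketch for verification only: the grid of points and the weights are arbitrary choices made here, and agreement on a finite grid supports the identity without replacing the derivation above.

```python
from fractions import Fraction as F
from itertools import product


def f(x1, x2):
    return x1**2 + x2**2


def gap(u, v, t):
    """Left side of (11.20) minus right side for z = x1^2 + x2^2."""
    w = (t * u[0] + (1 - t) * v[0], t * u[1] + (1 - t) * v[1])
    return t * f(*u) + (1 - t) * f(*v) - f(*w)


def closed_form(u, v, t):
    """The factored difference theta(1-theta)[(u1-v1)^2 + (u2-v2)^2]."""
    return t * (1 - t) * ((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2)


grid = [F(-2), F(0), F(1, 3), F(5)]
thetas = [F(1, 4), F(1, 2), F(9, 10)]
points = list(product(grid, repeat=2))

identity_ok = all(
    gap(u, v, t) == closed_form(u, v, t)
    for u in points for v in points for t in thetas
)
strict_ok = all(
    gap(u, v, t) > 0
    for u in points for v in points for t in thetas if u != v
)
print(identity_ok, strict_ok)   # True True
```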

Example 2

Check $z = -x_1^2 - x_2^2$ for concavity or convexity. This function is the negative of the function in Example 1. Thus, by Theorem II, it is strictly concave.

Example 3

Check $z = (x + y)^2$ for concavity or convexity. Even though the variables are denoted by $x$ and $y$ instead of $x_1$ and $x_2$, we can still let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ denote two distinct points in the domain, with the subscript $i$ referring to the $i$th variable. Then we have

$$f(u) = f(u_1, u_2) = (u_1 + u_2)^2$$

$$f(v) = f(v_1, v_2) = (v_1 + v_2)^2$$

and

$$f[\theta u + (1 - \theta)v] = [\theta u_1 + (1 - \theta)v_1 + \theta u_2 + (1 - \theta)v_2]^2 = [\theta(u_1 + u_2) + (1 - \theta)(v_1 + v_2)]^2$$

Substituting these into (11.20), subtracting the right-side expression from the left-side one, and simplifying, we find their difference to be

$$\theta(1 - \theta)(u_1 + u_2)^2 - 2\theta(1 - \theta)(u_1 + u_2)(v_1 + v_2) + \theta(1 - \theta)(v_1 + v_2)^2 = \theta(1 - \theta)[(u_1 + u_2) - (v_1 + v_2)]^2$$

As in Example 1, $\theta(1 - \theta)$ is positive. The square of the bracketed expression is nonnegative (zero cannot be ruled out this time). Thus the $\ge$ inequality holds in (11.20), and the function $(x + y)^2$ is convex, though not strictly so.

Accordingly, this function has an absolute minimum that may not be unique. It is easy to verify that the absolute minimum is $z^* = 0$, attained whenever $x^* + y^* = 0$. That this is an absolute minimum is clear from the fact that whenever $x + y \ne 0$, $z$ will be greater than $z^* = 0$. That it is not unique follows from the fact that an infinite number of $(x^*, y^*)$ pairs can satisfy the condition $x^* + y^* = 0$.
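The failure of strictness in Example 3 is easy to exhibit with two concrete pairs of points. In this sketch the points and the weight $\theta = \tfrac{1}{3}$ are arbitrary illustrative choices; exact fractions are used so that the zero gap is a true zero rather than a rounding artifact.

```python
from fractions import Fraction as F


def f(x, y):
    return (x + y) ** 2


def gap(u, v, t):
    """Left side of (11.20) minus right side for z = (x + y)^2."""
    w = (t * u[0] + (1 - t) * v[0], t * u[1] + (1 - t) * v[1])
    return t * f(*u) + (1 - t) * f(*v) - f(*w)


t = F(1, 3)
# Distinct points with different coordinate sums: the gap is strictly positive.
print(gap((F(1), F(2)), (F(0), F(0)), t))      # 2
# Distinct points with the same coordinate sum: the gap vanishes.
print(gap((F(1), F(-1)), (F(-3), F(3)), t))    # 0
```

The second pair lies on the line $x + y = 0$, along which the surface is flat, so the chord coincides with the arc.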


Differentiable Functions 

As stated in (11.20), the definition of concavity and convexity uses no derivatives and thus does not require differentiability. If the function is differentiable, however, concavity and convexity can also be defined in terms of its first derivatives. In the one-variable case, the definition is:

A differentiable function $f(x)$ is $\begin{Bmatrix} \text{concave} \\ \text{convex} \end{Bmatrix}$ iff, for any given point $u$ and any other point $v$ in the domain,

$$f(v) \begin{Bmatrix} \le \\ \ge \end{Bmatrix} f(u) + f'(u)(v - u) \tag{11.24}$$

FIGURE 11.7

Concavity and convexity will be strict if the weak inequalities in (11.24) are replaced by the strict inequalities $<$ and $>$, respectively. Interpreted geometrically, this definition depicts a concave (convex) curve as one that lies on or below (above) all its tangent lines. To qualify as a strictly concave (strictly convex) curve, on the other hand, the curve must lie strictly below (above) all the tangent lines, except at the points of tangency.

In Fig. 11.7, let point $A$ be any given point on the curve, with height $f(u)$ and with tangent line $AB$. Let $x$ increase from the value $u$. Then a strictly concave curve (as drawn) must, in order to form a hill, curl progressively away from the tangent line $AB$, so that point $C$, with height $f(v)$, has to lie below point $B$. In this case, the slope of line segment $AC$ is less than that of tangent $AB$. If the curve is nonstrictly concave, on the other hand, it may contain a line segment, so that, for instance, arc $AC$ may turn into a line segment and be coincident with line segment $AB$, as a linear portion of the curve. In the latter case the slope of $AC$ is equal to that of $AB$. Together, these two situations imply that

$$\left(\text{slope of line segment } AC = \frac{DC}{AD} = \right) \frac{f(v) - f(u)}{v - u} \le (\text{slope of } AB =)\ f'(u)$$

When multiplied through by the positive quantity $(v - u)$, this inequality yields the result in (11.24) for the concave function. The same result can be obtained if we consider instead $x$ values less than $u$.

When there are two or more independent variables, the definition needs a slight modification:

A differentiable function $f(x) = f(x_1, \ldots, x_n)$ is $\begin{Bmatrix} \text{concave} \\ \text{convex} \end{Bmatrix}$ iff, for any given point $u = (u_1, \ldots, u_n)$ and any other point $v = (v_1, \ldots, v_n)$ in the domain,

$$f(v) \begin{Bmatrix} \le \\ \ge \end{Bmatrix} f(u) + \sum_{j=1}^{n} f_j(u)(v_j - u_j) \tag{11.24'}$$

where $f_j(u) \equiv \partial f/\partial x_j$ is evaluated at $u = (u_1, \ldots, u_n)$.

This definition requires the graph of a concave (convex) function $f(x)$ to lie on or below (above) all its tangent planes or hyperplanes. For strict concavity and convexity, the weak inequalities in (11.24') should be changed to strict inequalities, which would require the graph of a strictly concave (strictly convex) function to lie strictly below (above) all its tangent planes or hyperplanes, except at the points of tangency.
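The tangent-plane condition (11.24') can be spot-checked in the same spirit as (11.20). The sketch below assumes the strictly concave function $z = -x_1^2 - x_2^2$ with its gradient written out by hand; the helper name `tangent_gap`, the sampling box, and the tolerance are illustrative choices, and a clean sample is evidence rather than proof.

```python
import random


def tangent_gap(f, grad, u, v):
    """f(v) - [f(u) + sum_j f_j(u) * (v_j - u_j)]: left minus right side of (11.24')."""
    plane = f(u) + sum(g * (b - a) for g, a, b in zip(grad(u), u, v))
    return f(v) - plane


def f(x):
    return -(x[0] ** 2) - x[1] ** 2


def grad(x):
    """Partial derivatives (f_1, f_2) of f."""
    return (-2 * x[0], -2 * x[1])


rng = random.Random(1)
gaps = []
for _ in range(1000):
    u = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
    v = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
    gaps.append(tangent_gap(f, grad, u, v))

# The surface should never rise above a sampled tangent plane (up to rounding).
print(max(gaps) <= 1e-9)
```

For this quadratic the gap equals $-[(v_1 - u_1)^2 + (v_2 - u_2)^2]$, so it is negative whenever $v \ne u$.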

Finally, consider a function $z = f(x_1, \ldots, x_n)$ which is twice continuously differentiable. For such a function, second-order partial derivatives exist, and thus $d^2z$ is defined. Concavity and convexity can then be checked by the sign of $d^2z$:

A twice continuously differentiable function $z = f(x_1, \ldots, x_n)$ is $\begin{Bmatrix} \text{concave} \\ \text{convex} \end{Bmatrix}$ if, and only if, $d^2z$ is everywhere $\begin{Bmatrix} \text{negative} \\ \text{positive} \end{Bmatrix}$ semidefinite. The said function is strictly $\begin{Bmatrix} \text{concave} \\ \text{convex} \end{Bmatrix}$ if (but not only if) $d^2z$ is everywhere $\begin{Bmatrix} \text{negative} \\ \text{positive} \end{Bmatrix}$ definite. (11.25)

You will recall that the concave and strictly concave aspects of (11.25) have already been incorporated into Fig. 11.5.


Example 4

Check $z = -x^4$ for concavity or convexity by the derivative conditions. We first apply (11.24). The left- and right-side expressions in that inequality are in the present case $-v^4$ and $-u^4 - 4u^3(v - u)$, respectively. Subtracting the latter from the former, we find their difference to be

$$-v^4 + u^4 + 4u^3(v - u) = (v - u)\left(-\frac{v^4 - u^4}{v - u} + 4u^3\right) \qquad \text{[factoring]}$$

$$= (v - u)[-(v^3 + v^2u + vu^2 + u^3) + 4u^3] \qquad \text{[by (7.2)]}$$

It would be nice if the bracketed expression turned out to be divisible by $(v - u)$, for then we could again factor out $(v - u)$ and obtain a squared term $(v - u)^2$ to facilitate the evaluation of sign. As it turns out, this is indeed the case. Thus the preceding difference can be written as

$$-(v - u)^2[v^2 + 2vu + 3u^2] = -(v - u)^2[(v + u)^2 + 2u^2]$$

Given that $v \ne u$, the sign of this expression must be negative. With the strict $<$ inequality holding in (11.24), the function $z = -x^4$ is strictly concave. This means that it has a unique absolute maximum. As can be easily verified, that maximum is $z^* = 0$, attained at $x^* = 0$.
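The factoring step in Example 4 can be verified exactly on a handful of rational points. This sketch is a check of the algebra only; the sample points are arbitrary, and the function names are chosen here for illustration.

```python
from fractions import Fraction as F


def f(x):
    return -x**4


def f_prime(x):
    return -4 * x**3


def tangent_gap(u, v):
    """f(v) - [f(u) + f'(u)(v - u)], the difference examined via (11.24)."""
    return f(v) - (f(u) + f_prime(u) * (v - u))


def factored(u, v):
    """The factored form -(v - u)^2 [(v + u)^2 + 2u^2]."""
    return -(v - u) ** 2 * ((v + u) ** 2 + 2 * u**2)


points = [F(-3), F(-1, 2), F(0), F(2, 5), F(4)]
pairs = [(u, v) for u in points for v in points if u != v]

print(all(tangent_gap(u, v) == factored(u, v) for u, v in pairs))  # True: the identity holds
print(all(tangent_gap(u, v) < 0 for u, v in pairs))                # True: strict inequality
```

Note that the pairs include $u = 0$, the stationary point itself, and the inequality is still strict there.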

Because this function is twice continuously differentiable, we may also apply (11.25). Since there is only one variable, (11.25) gives us

$$d^2z = f''(x)\,dx^2 = -12x^2\,dx^2 \qquad \text{[by (11.2)]}$$

We know that $dx^2$ is positive (only nonzero changes in $x$ are being considered); but $-12x^2$ can be either negative or zero. Thus the best we can do is to conclude that $d^2z$ is everywhere negative semidefinite, and that $z = -x^4$ is (nonstrictly) concave. This conclusion from (11.25) is obviously weaker than the one obtained earlier from (11.24); namely, $z = -x^4$ is strictly concave. What limits us to the weaker conclusion in this case is the same culprit that causes the second-derivative test to fail on occasion: the fact that $d^2z$ may take a zero value at a stationary point of a function known to be strictly concave, or strictly convex. This is why, of course, the negative (positive) definiteness of $d^2z$ is presented in (11.25) as only a sufficient, but not necessary, condition for strict concavity (strict convexity).

Example 5

Check $z = x_1^2 + x_2^2$ for concavity or convexity by the derivative conditions. This time we have to use (11.24') instead of (11.24). With $u = (u_1, u_2)$ and $v = (v_1, v_2)$ as any two points in the domain, the two sides of (11.24') are

$$\text{Left side} = v_1^2 + v_2^2$$

$$\text{Right side} = u_1^2 + u_2^2 + 2u_1(v_1 - u_1) + 2u_2(v_2 - u_2)$$

Subtracting the latter from the former, and simplifying, we can express their difference as

$$v_1^2 - 2v_1u_1 + u_1^2 + v_2^2 - 2v_2u_2 + u_2^2 = (v_1 - u_1)^2 + (v_2 - u_2)^2$$

Given that $(v_1, v_2) \ne (u_1, u_2)$, this difference is always positive. Thus the strict $>$ inequality holds in (11.24'), and $z = x_1^2 + x_2^2$ is strictly convex. Note that the present result merely reaffirms what we have previously found in Example 1.

As for the use of (11.25), since b =2x ]( and k = 2x 2 , we have 

<n-2>0 and = 2 n ° =4,0 

h\ hi 0 2 

regardless of where the second-order partial derivatives are evaluated. Thus d 2 z is every, 
where positive definite, which duly satisfies the sufficient condition for strict convexity. In 
the present example r therefore, (11.24') and (11.25) do yield the same conclusion. 
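
The determinantal test used in this example can be sketched in a few lines of Python. This is only an illustration under our own naming: the helper functions below are hypothetical, not part of the text, and they handle the 2 x 2 case only.

```python
def leading_principal_minors_2x2(h):
    """Return (|H1|, |H2|) for a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = h
    return a, a * d - b * c


def classify_2x2(h):
    """Classify a symmetric 2x2 matrix by the signs of its leading principal minors."""
    m1, m2 = leading_principal_minors_2x2(h)
    if m1 > 0 and m2 > 0:
        return "positive definite"
    if m1 < 0 and m2 > 0:
        return "negative definite"
    return "not definite"  # semidefinite or indefinite; this test cannot tell which


# Hessian of z = x1^2 + x2^2: f11 = 2, f12 = f21 = 0, f22 = 2 at every point.
hessian = [[2, 0], [0, 2]]
minors = leading_principal_minors_2x2(hessian)
verdict = classify_2x2(hessian)
print(minors, verdict)
```

Because the entries of this Hessian are constants, a single evaluation settles the sign of $d^2 z$ everywhere; for a function whose second partials vary with $x$, the check would have to hold at every point of the domain.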


Convex Functions versus Convex Sets 

Having clarified the meaning of the adjective convex as applied to a function, we must hasten
to explain its meaning when used to describe a set. Although convex sets and convex
functions are not unrelated, they are distinct concepts, and it is important not to confuse
them.

For easier intuitive grasp, let us begin with the geometric characterization of a convex
set. Let S be a set of points in a 2-space or 3-space. If, for any two points in set S, the line
segment connecting these two points lies entirely in S, then S is said to be a convex set. It
should be obvious that a straight line satisfies this definition and constitutes a convex set.
By convention, a set consisting of a single point is also considered as a convex set, and so
is the null set (with no point). For additional examples, let us look at Fig. 11.8. The disk
(namely, the "solid" circle, a circle plus all the points within it) is a convex set, because a
line joining any two points in the disk lies entirely in the disk, as exemplified by ab (linking
two boundary points) and cd (linking two interior points). Note, however, that a
(hollow) circle is not in itself a convex set. Similarly, a triangle, or a pentagon, is not in itself
a convex set, but its solid version is. The remaining two solid figures in Fig. 11.8 are
not convex sets. The palette-shaped figure is reentrant (indented); thus a line segment such
as gh does not lie entirely in the set. In the key-shaped figure, moreover, we find not only
the feature of reentrance, but also the presence of a hole, which is yet another cause of
nonconvexity. Generally speaking, to qualify as a convex set, the set of points must contain no
holes, and its boundary must not be indented anywhere.

FIGURE 11.8

FIGURE 11.9

The geometric definition of convexity also applies readily to point sets in a 3-space. For
instance, a solid cube is a convex set, whereas a hollow cylinder is not. When a 4-space or
a space of higher dimension is involved, however, the geometric interpretation becomes
less obvious. We then need to turn to the algebraic definition of convex sets.

To this end, it is useful to introduce the concept of convex combination of vectors
(points), which is a special type of linear combination. A linear combination of two vectors
u and v can be written as

$$k_1 u + k_2 v$$

where $k_1$ and $k_2$ are two scalars. When these two scalars both lie in the closed interval [0, 1]
and add up to unity, the linear combination is said to be a convex combination, and can be
expressed as

$$\theta u + (1 - \theta)v \qquad (0 \le \theta \le 1) \tag{11.26}$$


As an illustration, the combination

$$\frac{1}{3}\begin{bmatrix} 2 \\ 0 \end{bmatrix} + \frac{2}{3}\begin{bmatrix} 4 \\ 9 \end{bmatrix}$$

is a convex combination. In view of the fact that these two scalar multipliers are positive
fractions adding up to 1, such a convex combination may be interpreted as a weighted
average of the two vectors.*

The unique characteristic of the combination in (11.26) is that, for every acceptable
value of $\theta$, the resulting sum vector lies on the line segment connecting the points u and v.
This can be demonstrated by means of Fig. 11.9, where we have plotted two vectors

$$u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \quad\text{and}\quad v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$$

as two points with coordinates $(u_1, u_2)$ and $(v_1, v_2)$, respectively.


* This interpretation has been made use of earlier in the discussion of concave and convex functions.
If we plot another vector q such that Oquv forms a parallelogram, then we have (by virtue
of the discussion in Fig. 4.3)

$$u = q + v \quad\text{or}\quad q = u - v$$

It follows that a convex combination of vectors u and v (let us call it w) can be expressed in
terms of vector q, because

$$w = \theta u + (1 - \theta)v = \theta u + v - \theta v = \theta(u - v) + v = \theta q + v$$

Hence, to plot the vector w, we can simply add $\theta q$ and v by the familiar parallelogram
method. If the scalar $\theta$ is a positive fraction, the vector $\theta q$ will merely be an abridged
version of vector q; thus $\theta q$ must lie on the line segment Oq. Adding $\theta q$ and v, therefore, we
must find vector w lying on the line segment uv, for the new, smaller parallelogram is nothing
but the original parallelogram with the qu side shifted downward. The exact location of
vector w will, of course, vary according to the value of the scalar $\theta$; by varying $\theta$ from zero
to unity, the location of w will shift from v to u. Thus the set of all points on the line segment
uv, including u and v themselves, corresponds to the set of all convex combinations
of vectors u and v.
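
As a small numerical sketch of this result, the Python snippet below forms a convex combination and confirms the identity $w - v = \theta(u - v)$, which is what places w on the segment uv. The two vectors and the value of $\theta$ are arbitrary illustrative choices of ours, and the function name is hypothetical.

```python
from fractions import Fraction


def convex_combination(theta, u, v):
    """Return theta*u + (1 - theta)*v for 0 <= theta <= 1, as in (11.26)."""
    if not 0 <= theta <= 1:
        raise ValueError("theta must lie in the closed interval [0, 1]")
    return tuple(theta * ui + (1 - theta) * vi for ui, vi in zip(u, v))


# Arbitrary illustrative vectors (not taken from the text).
u = (Fraction(6), Fraction(1))
v = (Fraction(2), Fraction(5))
theta = Fraction(1, 4)

w = convex_combination(theta, u, v)

# w - v should equal theta * (u - v), i.e. theta * q with q = u - v.
w_minus_v = tuple(wi - vi for wi, vi in zip(w, v))
theta_q = tuple(theta * (ui - vi) for ui, vi in zip(u, v))
print(w, w_minus_v == theta_q)
```

Setting theta to 0 returns v and setting it to 1 returns u, matching the statement that w travels from v to u as $\theta$ rises from zero to unity.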

In view of the preceding, a convex set may now be redefined as follows: A set S is convex
if and only if, for any two points $u \in S$ and $v \in S$, and for every scalar $\theta \in [0, 1]$, it is
true that $w = \theta u + (1 - \theta)v \in S$. Because this definition is algebraic, it is applicable
regardless of the dimension of the space in which the vectors u and v are located. Comparing
this definition of a convex set with that of a convex function in (11.20), we see that even
though the same adjective convex is used in both, the meaning of this word changes radically
from one context to the other. In describing a function, the word convex specifies how
a curve or surface bends itself: it must form a valley. But in describing a set, the word
specifies how the points in the set are "packed" together: they must not allow any holes to
arise, and the boundary must not be indented. Thus convex functions and convex sets are
clearly distinct mathematical entities.
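
The algebraic definition lends itself to a simple search for counterexamples. The sketch below is our own illustration, with hypothetical function names and an arbitrary sample grid: it looks for two member points whose convex combination falls outside the set. Finding such a pair proves nonconvexity; finding none on a finite sample does not prove convexity.

```python
from fractions import Fraction
from itertools import product


def find_convexity_violation(is_member, points, thetas):
    """Return (u, v, theta) with theta*u + (1-theta)*v outside the set, or None."""
    members = [p for p in points if is_member(p)]
    for u, v in product(members, repeat=2):
        for theta in thetas:
            w = tuple(theta * ui + (1 - theta) * vi for ui, vi in zip(u, v))
            if not is_member(w):
                return u, v, theta
    return None


# Sample grid on [-1, 1] x [-1, 1] in steps of 1/4; theta in steps of 1/10.
axis = [Fraction(k, 4) for k in range(-4, 5)]
grid = list(product(axis, repeat=2))
thetas = [Fraction(k, 10) for k in range(11)]


def in_disk(p):
    """Solid circle of radius 1 centered at the origin."""
    return p[0] ** 2 + p[1] ** 2 <= 1


def in_ring(p):
    """The same disk with a hole of radius 1/2 cut out of the middle."""
    return Fraction(1, 4) <= p[0] ** 2 + p[1] ** 2 <= 1


disk_violation = find_convexity_violation(in_disk, grid, thetas)
ring_violation = find_convexity_violation(in_ring, grid, thetas)
print(disk_violation, ring_violation)
```

Exact fractions are used so that boundary points are classified without rounding error. The ring fails because some segment between two of its points passes through the hole, which is the algebraic counterpart of the "no holes" requirement.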

Yet convex functions and convex sets are not unrelated. For one thing, in defining a convex
function, we need a convex set for the domain. This is because the definition (11.20)
requires that, for any two points u and v in the domain, all the convex combinations of u
and v (specifically, $\theta u + (1 - \theta)v$, $0 \le \theta \le 1$) must also be in the domain, which is, of
course, just another way of saying that the domain must be a convex set. To satisfy this
requirement, we adopted earlier the rather strong assumption that the domain consists of the
entire n-space (where n is the number of choice variables), which is indeed a convex set.
However, with the concept of convex sets at our disposal, we can now substantially weaken
that assumption. For all we need to assume is that the domain is a convex subset of $R^n$,
rather than $R^n$ itself.

There is yet another way in which convex functions are related to convex sets. If f(x) is
a convex function, then for any constant k, it can give rise to a convex set

$$S^{\le} \equiv \{x \mid f(x) \le k\} \qquad [f(x)\ \text{convex}] \tag{11.27}$$

This is illustrated in Fig. 11.10a for the one-variable case. The set $S^{\le}$ consists of all the
x values associated with the segment of the f(x) curve lying on or below the broken horizontal
line. Hence it is the line segment on the horizontal axis marked by the heavy dots,
which is a convex set. Note that if the k value is changed, the $S^{\le}$ set will become a different
line segment on the horizontal axis, but it will still be a convex set.

FIGURE 11.10  (a) Set $S^{\le}$  (b) Set $S^{\ge}$

Going a step further, we may observe that even a concave function is related to convex
sets in similar ways. First, the definition of a concave function in (11.20) is, like the
convex-function case, predicated upon a domain that is a convex set. Moreover, even a concave
function, say g(x), can generate an associated convex set, given some constant k. That
convex set is

$$S^{\ge} \equiv \{x \mid g(x) \ge k\} \qquad [g(x)\ \text{concave}] \tag{11.28}$$

in which the $\ge$ sign appears instead of $\le$. Geometrically, as shown in Fig. 11.10b for the
one-variable case, the set $S^{\ge}$ contains all the x values corresponding to the segment of the
g(x) curve lying on or above the broken horizontal line. Thus it is again a line segment on
the horizontal axis, which is a convex set.

Although Fig. 11.10 specifically illustrates the one-variable case, the definitions of $S^{\le}$
and $S^{\ge}$ in (11.27) and (11.28) are not limited to functions of a single variable. They are
equally valid if we interpret x to be a vector, i.e., let $x = (x_1, \ldots, x_n)$. In that case, however,
(11.27) and (11.28) will define convex sets in the n-space instead. It is important to remember
that while a convex function implies (11.27), and a concave function implies
(11.28), the converse is not true, for (11.27) can also be satisfied by a nonconvex function
and (11.28) by a nonconcave function. This is discussed further in Sec. 12.4.
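
To make the failure of the converse concrete, the following sketch uses $f(x) = x^3$, an example of our own choosing rather than one from the text. On a grid of integers its set $S^{\le}$ is a single unbroken run of points (a line segment, hence convex), yet the function violates the usual convexity inequality $f[\theta u + (1-\theta)v] \le \theta f(u) + (1-\theta) f(v)$, which we assume is what (11.20) states. All function names are hypothetical.

```python
def sublevel_set(f, k, grid):
    """Grid points x with f(x) <= k, i.e. a sampled version of (11.27)."""
    return [x for x in grid if f(x) <= k]


def is_unbroken_run(points, grid):
    """True if the selected points are consecutive entries of the sorted grid."""
    if not points:
        return True
    start = grid.index(points[0])
    return points == grid[start:start + len(points)]


def breaks_convexity(f, u, v, theta):
    """True if f(theta*u + (1-theta)*v) exceeds theta*f(u) + (1-theta)*f(v)."""
    w = theta * u + (1 - theta) * v
    return f(w) > theta * f(u) + (1 - theta) * f(v)


def cube(x):
    return x ** 3


grid = list(range(-5, 6))              # integers from -5 to 5
s_le = sublevel_set(cube, 8, grid)     # x**3 <= 8 holds for x <= 2
run = is_unbroken_run(s_le, grid)
not_convex = breaks_convexity(cube, -4, 0, 0.5)   # compares f(-2) with -32
print(s_le, run, not_convex)
```

The midpoint of u = -4 and v = 0 gives f(-2) = -8, which lies above the chord value of -32, so the cubic is not convex even though every set $S^{\le}$ it generates is an interval.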


EXERCISE 11.5 

1. Use (11.20) to check whether the following functions are concave, convex, strictly concave,
strictly convex, or neither:

(a) $z = x^2$   (b) $z = x_1^2 + 2x_2^2$   (c) $z = 2x^2 - xy + y^2$

2. Use (11.24) or (11.24') to check whether the following functions are concave, convex,
strictly concave, strictly convex, or neither:

(a) $z = -x^2$   (b) $z = (x_1 - x_2)^2$   (c) $z = -xy$

3. In view of your answer to Prob. 2c, could you have made use of Theorem III of this
section to compartmentalize the task of checking the function $z = 2x^2 - xy + y^2$ in
Prob. 1c? Explain your answer.
4. Do the following constitute convex sets in the 3-space?

(a) A doughnut   (b) A bowling pin   (c) A perfect marble

5. The equation $x^2 + y^2 = 4$ represents a circle with center at (0, 0) and with a radius of 2.

(a) Interpret geometrically the set $\{(x, y) \mid x^2 + y^2 \le 4\}$.

(b) Is this set convex?


6. Graph each of the following sets, and indicate whether it is convex:

(a) $\{(x, y) \mid y = e^x\}$   (c) $\{(x, y) \mid y \le 13 - x^2\}$

(b) $\{(x, y) \mid y \ge e^x\}$   (d) $\{(x, y) \mid xy \ge 1;\ x > 0,\ y > 0\}$


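7. Given $u = \begin{bmatrix} 10 \\ 6 \end{bmatrix}$ and $v = \begin{bmatrix} 4 \\ 8 \end{bmatrix}$, which of the following are convex combinations of u
and v?

(a) $\begin{bmatrix} 7 \\ 7 \end{bmatrix}$   (b) $\begin{bmatrix} 5.2 \\ 7.6 \end{bmatrix}$   (c) $\begin{bmatrix} 6.2 \\ 8.2 \end{bmatrix}$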

8. Given two vectors u and v in the 2-space, find and sketch:

(a) The set of all linear combinations of u and v.

(b) The set of all nonnegative linear combinations of u and v.

(c) The set of all convex combinations of u and v.

9. (a) Rewrite (11.27) and (11.28) specifically for the cases where the f and g functions
have n independent variables.

(b) Let n = 2, and let the function f be shaped like a (vertically held) ice-cream cone
whereas the function g is shaped like a pyramid. Describe the sets $S^{\le}$ and $S^{\ge}$.


11.6 Economic Applications 


At the beginning of this chapter, the case of a multiproduct firm was cited as an illustration
of the general problem of optimization with more than one choice variable. We are now
equipped to handle that problem and others of a similar nature.


Problem of a Multiproduct Firm

Example 1

Let us first postulate a two-product firm under circumstances of pure competition. Since
with pure competition the prices of both commodities must be taken as exogenous,
these will be denoted by $P_{10}$ and $P_{20}$, respectively. Accordingly, the firm's revenue function
will be

$$R = P_{10}Q_1 + P_{20}Q_2$$

where $Q_i$ represents the output level of the ith product per unit of time. The firm's cost
function is assumed to be

$$C = 2Q_1^2 + Q_1Q_2 + 2Q_2^2$$

Note that $\partial C/\partial Q_1 = 4Q_1 + Q_2$ (the marginal cost of the first product) is a function not
only of $Q_1$ but also of $Q_2$. Similarly, the marginal cost of the second product also depends,
in part, on the output level of the first product. Thus, according to the assumed cost function,
the two commodities are seen to be technically related in production.

The profit function of this hypothetical firm can now be written readily as

$$\pi = R - C = P_{10}Q_1 + P_{20}Q_2 - 2Q_1^2 - Q_1Q_2 - 2Q_2^2$$
a function of two choice variables ($Q_1$ and $Q_2$) and two price parameters. It is our task to
find the levels of $Q_1$ and $Q_2$ which, in combination, will maximize $\pi$. For this purpose, we
first find the first-order partial derivatives of the profit function:

$$\pi_1 \left(\equiv \frac{\partial \pi}{\partial Q_1}\right) = P_{10} - 4Q_1 - Q_2$$
$$\pi_2 \left(\equiv \frac{\partial \pi}{\partial Q_2}\right) = P_{20} - Q_1 - 4Q_2 \tag{11.29}$$

Setting both equal to zero, to satisfy the necessary condition for a maximum, we get the
two simultaneous equations

$$4Q_1 + Q_2 = P_{10}$$
$$Q_1 + 4Q_2 = P_{20}$$

which yield the unique solution

$$Q_1^* = \frac{4P_{10} - P_{20}}{15} \quad\text{and}\quad Q_2^* = \frac{4P_{20} - P_{10}}{15}$$

Thus, if $P_{10} = 12$ and $P_{20} = 18$, for example, we have $Q_1^* = 2$ and $Q_2^* = 4$, implying an
optimal profit $\pi^* = 48$ per unit of time.

To be sure that this does represent a maximum profit, let us check the second-order
condition. The second partial derivatives, obtainable by partial differentiation of (11.29), give
us the following Hessian:

$$|H| = \begin{vmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{vmatrix} = \begin{vmatrix} -4 & -1 \\ -1 & -4 \end{vmatrix}$$

Since $|H_1| = -4 < 0$ and $|H_2| = 15 > 0$, the Hessian matrix (or $d^2 z$) is negative definite,
and the solution does maximize the profit. In fact, since the signs of the leading principal
minors do not depend on where they are evaluated, $d^2 z$ is in this case everywhere negative
definite. Thus, according to (11.25), the objective function must be strictly concave, and
the maximum profit just found is actually a unique absolute maximum.
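
The arithmetic of Example 1 can be retraced with a short script. What follows is a sketch of our own, with hypothetical function names, that solves the two first-order conditions by Cramer's rule in exact fractions and evaluates the profit function at the solution.

```python
from fractions import Fraction


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    return Fraction(b1 * a22 - a12 * b2, det), Fraction(a11 * b2 - b1 * a21, det)


def profit(q1, q2, p10, p20):
    """pi = P10*Q1 + P20*Q2 - 2*Q1^2 - Q1*Q2 - 2*Q2^2."""
    return p10 * q1 + p20 * q2 - 2 * q1 ** 2 - q1 * q2 - 2 * q2 ** 2


p10, p20 = 12, 18
# First-order conditions: 4*Q1 + Q2 = P10 and Q1 + 4*Q2 = P20.
q1_star, q2_star = solve_2x2(4, 1, 1, 4, p10, p20)
pi_star = profit(q1_star, q2_star, p10, p20)

# Leading principal minors of the Hessian [[-4, -1], [-1, -4]].
h1 = -4
h2 = (-4) * (-4) - (-1) * (-1)
print(q1_star, q2_star, pi_star, h1, h2)
```

Changing `p10` and `p20` shows how the optimal outputs respond to the exogenous prices, while the Hessian minors stay fixed because they do not depend on the prices.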


Example 2 


Let us now transplant the problem of Example 1 into the setting of a monopolistic market. 
By virtue of this new market-structure assumption, the revenue function must be modified 
to reflect the fact that the prices of the two products will now vary with their output levels 
(which are assumed to be identical with their sales levels, no inventory accumulation being 
contemplated in the model). The exact manner in which prices will vary with output levels 
is, of course, to be found in the demand functions for the firm's two products. 

Suppose that the demands facing the monopolist firm are as follows:

$$Q_1 = 40 - 2P_1 + P_2$$
$$Q_2 = 15 + P_1 - P_2 \tag{11.30}$$

These equations reveal that the two commodities are related in consumption; specifically,
they are substitute goods, because an increase in the price of one will raise the demand
for the other. As given, (11.30) expresses the quantities demanded $Q_1$ and $Q_2$ as functions
of prices, but for our present purposes it will be more convenient to have prices $P_1$ and $P_2$
expressed in terms of the sales volumes $Q_1$ and $Q_2$, that is, to have average-revenue functions
for the two products. Since (11.30) can be rewritten as

$$-2P_1 + P_2 = Q_1 - 40$$
$$P_1 - P_2 = Q_2 - 15$$
we may (considering $Q_1$ and $Q_2$ as parameters) apply Cramer's rule to solve for $P_1$ and $P_2$
as follows:

$$P_1 = 55 - Q_1 - Q_2$$
$$P_2 = 70 - Q_1 - 2Q_2 \tag{11.30'}$$

These constitute the desired average-revenue functions, since $P_1 \equiv AR_1$ and $P_2 \equiv AR_2$.
Consequently, the firm's total-revenue function can be written as

$$R = P_1Q_1 + P_2Q_2$$
$$\phantom{R} = (55 - Q_1 - Q_2)Q_1 + (70 - Q_1 - 2Q_2)Q_2 \qquad \text{[by (11.30')]}$$
$$\phantom{R} = 55Q_1 + 70Q_2 - 2Q_1Q_2 - Q_1^2 - 2Q_2^2$$

If we again assume the total-cost function to be

$$C = Q_1^2 + Q_1Q_2 + Q_2^2$$

then the profit function will be

$$\pi = R - C = 55Q_1 + 70Q_2 - 3Q_1Q_2 - 2Q_1^2 - 3Q_2^2 \tag{11.31}$$

which is an objective function with two choice variables. Once the profit-maximizing output
levels $Q_1^*$ and $Q_2^*$ are found, however, the optimal prices $P_1^*$ and $P_2^*$ are easy enough to
find from (11.30').

The objective function yields the following first and second partial derivatives:

$$\pi_1 = 55 - 3Q_2 - 4Q_1 \qquad \pi_2 = 70 - 3Q_1 - 6Q_2$$
$$\pi_{11} = -4 \qquad \pi_{12} = \pi_{21} = -3 \qquad \pi_{22} = -6$$

To satisfy the first-order condition for a maximum of $\pi$, we must have $\pi_1 = \pi_2 = 0$; that is,

$$4Q_1 + 3Q_2 = 55$$
$$3Q_1 + 6Q_2 = 70$$

Thus the solution output levels (per unit of time) are

$$(Q_1^*, Q_2^*) = \left(8,\ 7\tfrac{2}{3}\right)$$

Upon substitution of this result into (11.30') and (11.31), respectively, we find that

$$P_1^* = 39\tfrac{1}{3} \qquad P_2^* = 46\tfrac{2}{3} \qquad\text{and}\qquad \pi^* = 488\tfrac{1}{3} \quad \text{(per unit of time)}$$

Inasmuch as the Hessian is

$$\begin{vmatrix} -4 & -3 \\ -3 & -6 \end{vmatrix}$$

we have

$$|H_1| = -4 < 0 \quad\text{and}\quad |H_2| = 15 > 0$$

so that the value of $\pi^*$ does represent the maximum profit. Here, the signs of the leading
principal minors are again independent of where they are evaluated. Thus the Hessian matrix
is everywhere negative definite, implying that the objective function is strictly concave
and that it has a unique absolute maximum.
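
The mixed-number results of Example 2 are easy to confirm in exact arithmetic. The sketch below is our own (the function names are hypothetical); it solves the first-order conditions, recovers the prices from (11.30'), and evaluates (11.31).

```python
from fractions import Fraction


def solve_first_order_conditions():
    """Solve 4*Q1 + 3*Q2 = 55 and 3*Q1 + 6*Q2 = 70 by Cramer's rule."""
    det = 4 * 6 - 3 * 3
    q1 = Fraction(55 * 6 - 3 * 70, det)
    q2 = Fraction(4 * 70 - 3 * 55, det)
    return q1, q2


def prices(q1, q2):
    """Average-revenue functions (11.30')."""
    return 55 - q1 - q2, 70 - q1 - 2 * q2


def profit(q1, q2):
    """Profit function (11.31)."""
    return 55 * q1 + 70 * q2 - 3 * q1 * q2 - 2 * q1 ** 2 - 3 * q2 ** 2


q1_star, q2_star = solve_first_order_conditions()
p1_star, p2_star = prices(q1_star, q2_star)
pi_star = profit(q1_star, q2_star)

# Cross-check: profit must also equal revenue minus cost computed separately.
revenue = p1_star * q1_star + p2_star * q2_star
cost = q1_star ** 2 + q1_star * q2_star + q2_star ** 2
print(q1_star, q2_star, p1_star, p2_star, pi_star, revenue - cost == pi_star)
```

The fractions 23/3, 118/3, 140/3, and 1465/3 printed by the script are the improper-fraction forms of the mixed numbers reported in the example.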


Price Discrimination 

Even in a single-product firm, there can arise an optimization problem involving two or
more choice variables. Such would be the case, for instance, when a monopolistic firm sells
a single product in two or more separate markets (e.g., domestic and foreign) and therefore
must decide upon the quantities ($Q_1$, $Q_2$, etc.) to be supplied to the respective markets in
order to maximize profit. The several markets will, in general, have different demand
conditions, and if demand elasticities differ in the various markets, profit maximization
will entail the practice of price discrimination. Let us derive this familiar conclusion
mathematically.


Example 3 


For a change of pace, this time let us use three choice variables, i.e., assume three separate
markets. Also, let us work with general rather than numerical functions. Accordingly, our
monopolistic firm will simply be assumed to have total-revenue and total-cost functions as
follows:

$$R = R_1(Q_1) + R_2(Q_2) + R_3(Q_3)$$
$$C = C(Q) \qquad \text{where } Q = Q_1 + Q_2 + Q_3$$

Note that the symbol $R_i$ represents here the revenue function of the ith market, rather
than a derivative in the sense of $f_i$. Each such revenue function naturally implies a particular
demand structure, which will generally be different from those prevailing in the other
two markets. On the cost side, on the other hand, only one cost function is postulated,
since a single firm is producing for all three markets. In view of the fact that $Q =
Q_1 + Q_2 + Q_3$, total cost C is also basically a function of $Q_1$, $Q_2$, and $Q_3$, which constitute
the choice variables of the model. We can, of course, rewrite $C(Q)$ as $C(Q_1 + Q_2 + Q_3)$. It
should be noted, however, that even though the latter version contains three independent
variables, the function should nevertheless be considered as having a single argument only,
because the sum of $Q_i$ is really a single entity. In contrast, if the function appears in the form
$C(Q_1, Q_2, Q_3)$, then there can be counted as many arguments as independent variables.

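Now the profit function is

$$\pi = R_1(Q_1) + R_2(Q_2) + R_3(Q_3) - C(Q)$$

with first partial derivatives $\pi_i \equiv \partial\pi/\partial Q_i$ (for i = 1, 2, 3) as follows:*

$$\pi_1 = R_1'(Q_1) - C'(Q)\frac{\partial Q}{\partial Q_1} = R_1'(Q_1) - C'(Q) \qquad \left[\text{since } \frac{\partial Q}{\partial Q_1} = 1\right]$$
$$\pi_2 = R_2'(Q_2) - C'(Q)\frac{\partial Q}{\partial Q_2} = R_2'(Q_2) - C'(Q) \qquad \left[\text{since } \frac{\partial Q}{\partial Q_2} = 1\right]$$
$$\pi_3 = R_3'(Q_3) - C'(Q)\frac{\partial Q}{\partial Q_3} = R_3'(Q_3) - C'(Q) \qquad \left[\text{since } \frac{\partial Q}{\partial Q_3} = 1\right]$$

Setting these equal to zero simultaneously will give us

$$C'(Q) = R_1'(Q_1) = R_2'(Q_2) = R_3'(Q_3) \tag{11.32}$$

That is,

$$MC = MR_1 = MR_2 = MR_3$$

Thus the levels of $Q_1$, $Q_2$, and $Q_3$ should be chosen such that the marginal revenue in each
market is equated to the marginal cost of the total output Q.

* Note that, to find $\partial C/\partial Q_i$, the chain rule is used:

$$\frac{\partial C}{\partial Q_i} = \frac{dC}{dQ}\frac{\partial Q}{\partial Q_i}$$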
To see the implications of this condition with regard to price discrimination, let us first
find out how the MR in any market is specifically related to the price in that market. Since
the revenue in each market is $R_i = P_iQ_i$, it follows that the marginal revenue must be

$$MR_i \equiv \frac{dR_i}{dQ_i} = P_i\frac{dQ_i}{dQ_i} + Q_i\frac{dP_i}{dQ_i} = P_i\left(1 + \frac{dP_i}{dQ_i}\frac{Q_i}{P_i}\right) = P_i\left(1 + \frac{1}{\varepsilon_{di}}\right) \qquad \text{[by (8.4)]}$$

where $\varepsilon_{di}$, the point elasticity of demand in the ith market, is normally negative. Consequently,
the relationship between $MR_i$ and $P_i$ can be expressed alternatively by the equation

$$MR_i = P_i\left(1 - \frac{1}{|\varepsilon_{di}|}\right) \tag{11.33}$$

Recall that $|\varepsilon_{di}|$ is, in general, a function of $P_i$, so that when $Q_i^*$ is chosen, and $P_i^*$ thus specified,
$|\varepsilon_{di}|$ will also assume a specific value, which can be either greater than, or less than, or
equal to one. But if $|\varepsilon_{di}| < 1$ (demand being inelastic at a point), then its reciprocal will
exceed one, and the parenthesized expression in (11.33) will be negative, thereby implying
a negative value for $MR_i$. Similarly, if $|\varepsilon_{di}| = 1$ (unitary elasticity), then $MR_i$ will take a zero
value. Inasmuch as a firm's MC is positive, the first-order condition $MC = MR_i$ requires
the firm to operate at a positive level of $MR_i$. Hence the firm's chosen sales levels $Q_i$ must
be such that the corresponding point elasticity of demand in each market is greater than
one in absolute value.

The first-order condition $MR_1 = MR_2 = MR_3$ can now be translated, via (11.33), into the
following:

$$P_1\left(1 - \frac{1}{|\varepsilon_{d1}|}\right) = P_2\left(1 - \frac{1}{|\varepsilon_{d2}|}\right) = P_3\left(1 - \frac{1}{|\varepsilon_{d3}|}\right)$$

From this it can readily be inferred that the smaller the value of $|\varepsilon_d|$ (at the chosen level of
output) in a particular market, the higher the price charged in that market must be (hence,
price discrimination) if profit is to be maximized.

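To ensure maximization, let us examine the second-order condition. From (11.32), the
second partial derivatives are found to be

$$\pi_{11} = R_1''(Q_1) - C''(Q)\frac{\partial Q}{\partial Q_1} = R_1''(Q_1) - C''(Q)$$
$$\pi_{22} = R_2''(Q_2) - C''(Q)\frac{\partial Q}{\partial Q_2} = R_2''(Q_2) - C''(Q)$$
$$\pi_{33} = R_3''(Q_3) - C''(Q)\frac{\partial Q}{\partial Q_3} = R_3''(Q_3) - C''(Q)$$

and

$$\pi_{12} = \pi_{21} = \pi_{13} = \pi_{31} = \pi_{23} = \pi_{32} = -C''(Q) \qquad \left[\text{since } \frac{\partial Q}{\partial Q_i} = 1\right]$$

so that we have (after shortening the second-derivative notation)

$$|H| = \begin{vmatrix} R_1'' - C'' & -C'' & -C'' \\ -C'' & R_2'' - C'' & -C'' \\ -C'' & -C'' & R_3'' - C'' \end{vmatrix}$$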
The second-order sufficient condition will thus be duly satisfied, provided we have:

1. $|H_1| = R_1'' - C'' < 0$; that is, the slope of $MR_1$ is less than the slope of MC of the entire
output [cf. the situation of point L in Fig. 9.6c]. (Since any of the three markets can be
taken as the "first" market, this in effect also implies $R_2'' - C'' < 0$ and $R_3'' - C'' < 0$.)

2. $|H_2| = (R_1'' - C'')(R_2'' - C'') - (C'')^2 > 0$; or, $R_1''R_2'' - (R_1'' + R_2'')C'' > 0$.

3. $|H_3| = R_1''R_2''R_3'' - (R_1''R_2'' + R_1''R_3'' + R_2''R_3'')C'' < 0$.

The last two parts of this condition are not as easy to interpret economically as the first.
Note that had we assumed that the general $R_i(Q_i)$ functions are all concave and the general
$C(Q)$ function is convex, so that $-C(Q)$ is concave, then the profit function (the sum
of concave functions) could have been taken to be concave, thereby obviating the need to
check the second-order condition.


Example 4 


To make the above example more concrete, let us now give a numerical version. Suppose
that our monopolistic firm has the specific average-revenue functions

$$P_1 = 63 - 4Q_1 \qquad\text{so that}\qquad R_1 = P_1Q_1 = 63Q_1 - 4Q_1^2$$
$$P_2 = 105 - 5Q_2 \qquad\qquad\qquad R_2 = P_2Q_2 = 105Q_2 - 5Q_2^2$$
$$P_3 = 75 - 6Q_3 \qquad\qquad\qquad R_3 = P_3Q_3 = 75Q_3 - 6Q_3^2$$

and that the total-cost function is

$$C = 20 + 15Q$$

Then the marginal functions will be

$$R_1' = 63 - 8Q_1 \qquad R_2' = 105 - 10Q_2 \qquad R_3' = 75 - 12Q_3 \qquad C' = 15$$

When each marginal revenue $R_i'$ is set equal to the marginal cost $C'$ of the total output, the
equilibrium quantities are found to be

$$Q_1^* = 6 \qquad Q_2^* = 9 \qquad\text{and}\qquad Q_3^* = 5$$

Thus

$$Q^* = \sum_{i=1}^{3} Q_i^* = 20$$

Substituting these solutions into the revenue and cost equations, we get $\pi^* = 679$ as the
total profit from the triple-market business operation.

Because this is a specific model, we do have to check the second-order condition (or
the concavity of the objective function). Since the second derivatives are

$$R_1'' = -8 \qquad R_2'' = -10 \qquad R_3'' = -12 \qquad C'' = 0$$

all three parts of the second-order sufficient conditions given in Example 3 are duly satisfied.

It is easy to see from the average-revenue functions that the firm should charge the
discriminatory prices $P_1^* = 39$, $P_2^* = 60$, and $P_3^* = 45$ in the three markets. As you can readily
verify, the point elasticity of demand is lowest in the second market, in which the highest
price is charged.
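
The verification left to the reader can be carried out as in the sketch below, which is our own illustration with hypothetical names. It sets $MR_i = MC$ in each market, computes the prices and point elasticities, and confirms that (11.33) returns the common marginal revenue of 15.

```python
from fractions import Fraction

# (intercept a_i, slope b_i) of the average-revenue functions P_i = a_i - b_i*Q_i
MARKETS = [(63, 4), (105, 5), (75, 6)]
FIXED_COST, MARGINAL_COST = 20, 15      # C = 20 + 15Q


def optimal_quantity(a, b, mc):
    """Solve MR_i = a - 2*b*Q_i = MC for Q_i."""
    return Fraction(a - mc, 2 * b)


def point_elasticity(a, b, q):
    """(dQ/dP)*(P/Q) for the linear demand P = a - b*Q, where dQ/dP = -1/b."""
    p = a - b * q
    return Fraction(-1, b) * p / q


quantities = [optimal_quantity(a, b, MARGINAL_COST) for a, b in MARKETS]
price_list = [a - b * q for (a, b), q in zip(MARKETS, quantities)]
elasticities = [point_elasticity(a, b, q) for (a, b), q in zip(MARKETS, quantities)]

# (11.33): MR_i = P_i * (1 - 1/|e_di|) should equal MC in every market.
marginal_revenues = [p * (1 - 1 / abs(e)) for p, e in zip(price_list, elasticities)]

revenue = sum(p * q for p, q in zip(price_list, quantities))
profit = revenue - (FIXED_COST + MARGINAL_COST * sum(quantities))
print(quantities, price_list, elasticities, marginal_revenues, profit)
```

The elasticities come out as -13/8, -4/3, and -3/2, so the second market has the smallest absolute elasticity and, as the text states, the highest price.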

Input Decisions of a Firm 

Instead of output levels $Q_i$, the choice variables of a firm may also appear in the guise of
input levels.
Example 5

Consider a competitive firm with the following profit function

$$\pi = R - C = PQ - wL - rK \tag{11.34}$$

where P = price
Q = output
L = labor
K = capital
w, r = input prices for L and K, respectively

Since the firm operates in a competitive market, the exogenous variables are P, w, and r
(written here without the zero subscript). There are three endogenous variables, K, L, and Q.
However, output Q is in turn a function of K and L via the production function

$$Q = Q(K, L)$$

We shall assume it to be a Cobb-Douglas function (further discussed in Sec. 12.6) of the
form

$$Q = L^{\alpha}K^{\beta}$$

where $\alpha$ and $\beta$ are positive parameters. If we further assume decreasing returns to scale,
then $\alpha + \beta < 1$. For simplicity, we shall consider the symmetric case where $\alpha = \beta < \frac{1}{2}$:

$$Q = L^{\alpha}K^{\alpha} \tag{11.35}$$

Substituting (11.35) into (11.34) gives us

$$\pi(K, L) = PL^{\alpha}K^{\alpha} - wL - rK$$

The first-order condition for profit maximization is

$$\frac{\partial \pi}{\partial L} = P\alpha L^{\alpha-1}K^{\alpha} - w = 0$$
$$\frac{\partial \pi}{\partial K} = P\alpha L^{\alpha}K^{\alpha-1} - r = 0 \tag{11.36}$$

This system of equations defines the optimal L and K for profit maximization. But first, let us
check the second-order condition to verify that we do have a maximum.

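The Hessian for this problem is

$$|H| = \begin{vmatrix} \pi_{LL} & \pi_{LK} \\ \pi_{KL} & \pi_{KK} \end{vmatrix} = \begin{vmatrix} P\alpha(\alpha-1)L^{\alpha-2}K^{\alpha} & P\alpha^2L^{\alpha-1}K^{\alpha-1} \\ P\alpha^2L^{\alpha-1}K^{\alpha-1} & P\alpha(\alpha-1)L^{\alpha}K^{\alpha-2} \end{vmatrix}$$

The sufficient condition for a maximum is that $|H_1| < 0$ and $|H| > 0$:

$$|H_1| = P\alpha(\alpha-1)L^{\alpha-2}K^{\alpha} < 0$$
$$|H| = P^2\alpha^2(\alpha-1)^2L^{2\alpha-2}K^{2\alpha-2} - P^2\alpha^4L^{2\alpha-2}K^{2\alpha-2}$$
$$\phantom{|H|} = P^2\alpha^2L^{2\alpha-2}K^{2\alpha-2}(1 - 2\alpha) > 0$$

Therefore, for $\alpha < \frac{1}{2}$, the second-order sufficient condition is satisfied.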

We can now return to the first-order condition to solve for the optimal K and L. Rewriting
the first equation in (11.36) to isolate K, we get

$$P\alpha L^{\alpha-1}K^{\alpha} = w$$
$$K = \left(\frac{w}{P\alpha}L^{1-\alpha}\right)^{1/\alpha}$$

Substituting this into the second equation of (11.36), we have

$$P\alpha L^{\alpha}K^{\alpha-1} - r = P\alpha L^{\alpha}\left[\left(\frac{w}{P\alpha}L^{1-\alpha}\right)^{1/\alpha}\right]^{\alpha-1} - r = 0$$

or

$$P^{1/\alpha}\alpha^{1/\alpha}w^{(\alpha-1)/\alpha}L^{(2\alpha-1)/\alpha} = r$$

Rearranging to solve for L then gives us

$$L^* = \left(P\alpha w^{\alpha-1}r^{-\alpha}\right)^{1/(1-2\alpha)}$$

Taking advantage of the symmetry of the model, we can quickly write the optimal K as

$$K^* = \left(P\alpha r^{\alpha-1}w^{-\alpha}\right)^{1/(1-2\alpha)}$$

$L^*$ and $K^*$ are the firm's input demand equations.

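If we substitute $L^*$ and $K^*$ into the production function, we find that

$$Q^* = (L^*)^{\alpha}(K^*)^{\alpha}$$
$$\phantom{Q^*} = \left(P\alpha w^{\alpha-1}r^{-\alpha}\right)^{\alpha/(1-2\alpha)}\left(P\alpha r^{\alpha-1}w^{-\alpha}\right)^{\alpha/(1-2\alpha)}$$
$$\phantom{Q^*} = \left(\frac{\alpha^2P^2}{wr}\right)^{\alpha/(1-2\alpha)} \tag{11.37}$$

This gives us an expression for the optimal output as a function of the exogenous variables
P, w, and r.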
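
A numerical spot check of the input demand equations may be helpful. The sketch below is our own: the parameter values P = 8, w = 2, r = 1, and $\alpha$ = 0.25 are arbitrary assumptions chosen for illustration, and the function names are hypothetical. It evaluates $L^*$ and $K^*$, substitutes them back into (11.36), and compares $(L^*)^{\alpha}(K^*)^{\alpha}$ with (11.37).

```python
import math


def input_demands(P, w, r, alpha):
    """Return (L*, K*) from the input demand equations of Example 5."""
    if not 0 < alpha < 0.5:
        raise ValueError("the second-order condition requires 0 < alpha < 1/2")
    exponent = 1 / (1 - 2 * alpha)
    L = (P * alpha * w ** (alpha - 1) * r ** (-alpha)) ** exponent
    K = (P * alpha * r ** (alpha - 1) * w ** (-alpha)) ** exponent
    return L, K


def first_order_residuals(P, w, r, alpha, L, K):
    """Left-hand sides of (11.36); both should vanish at the optimum."""
    dpi_dL = P * alpha * L ** (alpha - 1) * K ** alpha - w
    dpi_dK = P * alpha * L ** alpha * K ** (alpha - 1) - r
    return dpi_dL, dpi_dK


# Illustrative parameter values (assumptions, not from the text).
P, w, r, alpha = 8.0, 2.0, 1.0, 0.25

L_star, K_star = input_demands(P, w, r, alpha)
residuals = first_order_residuals(P, w, r, alpha, L_star, K_star)
Q_direct = L_star ** alpha * K_star ** alpha
Q_formula = (alpha ** 2 * P ** 2 / (w * r)) ** (alpha / (1 - 2 * alpha))
print(L_star, K_star, residuals, Q_direct, Q_formula)
```

With these values the more expensive input (labor, since w > r) is used in the smaller amount, and the two expressions for optimal output agree up to floating-point rounding.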


Example 6 


Let us assume the following circumstances: (1) Two inputs a and b are used in the production
of a single product Q of a hypothetical firm. (2) The prices of both inputs, $P_a$ and $P_b$,
are beyond the control of the firm, as is the output price P; here we shall denote them
by $P_{a0}$, $P_{b0}$, and $P_0$, respectively. (3) The production process takes $t_0$ years ($t_0$ being some
positive constant) to complete; thus the revenue from sales should be duly discounted
before it can be properly compared with the cost of production incurred at the present
time. The rate of discount, on a continuous basis, is assumed to be given at $r_0$.

Upon assumption 1, we can write a general production function $Q = Q(a, b)$, with marginal
physical products $Q_a$ and $Q_b$. Assumption 2 enables us to express the total cost as

$$C = P_{a0}a + P_{b0}b$$

and the total revenue as

$$R = P_0Q(a, b)$$

To write the profit function, however, we must first discount the revenue by multiplying it
by the constant $e^{-r_0t_0}$, which, to avoid complicated superscripts with subscripts, we shall
write as $e^{-rt}$. Thus, the profit function is

$$\pi = P_0Q(a, b)e^{-rt} - P_{a0}a - P_{b0}b$$

in which a and b are the only choice variables.

To maximize profit, it is necessary that the first partial derivatives

$$\pi_a \left(\equiv \frac{\partial \pi}{\partial a}\right) = P_0Q_ae^{-rt} - P_{a0}$$
$$\pi_b \left(\equiv \frac{\partial \pi}{\partial b}\right) = P_0Q_be^{-rt} - P_{b0} \tag{11.38}$$
both be zero. This means that

$$P_0Q_ae^{-rt} = P_{a0} \quad\text{and}\quad P_0Q_be^{-rt} = P_{b0} \tag{11.39}$$

Since $P_0Q_a$ (the price of the product times the marginal product of input a) represents the
value of marginal product of input a ($VMP_a$), the first equation merely says that the present
value of $VMP_a$ should be equated to the given price of input a. The second equation is the
same prerequisite applied to input b.

Note that, to satisfy (11.39), both marginal physical products $Q_a$ and $Q_b$ must be
positive, because $P_0$, $P_{a0}$, $P_{b0}$, and $e^{-rt}$ all have positive values. This has an important
interpretation in terms of an isoquant, defined as the locus of input combinations that yield the
same output level. When plotted in the ab plane, isoquants will generally appear like those
drawn in Fig. 11.11. Inasmuch as each of them pertains to a fixed output level, along any
isoquant we must have

$$dQ = Q_a\,da + Q_b\,db = 0$$

which implies that the slope of an isoquant is expressible as

$$\frac{db}{da} = -\frac{Q_a}{Q_b} \qquad \left(= -\frac{MPP_a}{MPP_b}\right) \tag{11.40}$$

Thus, to have both $Q_a$ and $Q_b$ positive is to confine the firm's input choice to the negatively
sloped segments of the isoquants only. In Fig. 11.11, the relevant region of operation
is accordingly restricted to the shaded area defined by the two so-called ridge lines. Outside
the shaded area, where the isoquants are characterized by positive slopes, the marginal
product of one input must be negative. The movement from the input combination at M to
the one at N, for instance, indicates that with input b held constant the increase in input a
leads us to a lower isoquant (a smaller output); thus, $Q_a$ must be negative. Similarly, a
movement from M' to N' illustrates the negativity of $Q_b$. Note that when we confine our
attention to the shaded area, each isoquant can be taken as a function of the form $b = \phi(a)$,
because for every admissible value of a, the isoquant determines a unique value of b.
The second-order condition revolves around the second partial derivatives of $\pi$, obtainable
from (11.38). Bearing in mind that $Q_a$ and $Q_b$, being derivatives, are themselves functions
of the variables a and b, we can find $\pi_{aa}$, $\pi_{ab} = \pi_{ba}$, and $\pi_{bb}$, and arrange them into a
Hessian:

$$|H| = \begin{vmatrix} \pi_{aa} & \pi_{ab} \\ \pi_{ab} & \pi_{bb} \end{vmatrix} = \begin{vmatrix} P_0Q_{aa}e^{-rt} & P_0Q_{ab}e^{-rt} \\ P_0Q_{ab}e^{-rt} & P_0Q_{bb}e^{-rt} \end{vmatrix} \tag{11.41}$$

For a stationary value of $\pi$ to be a maximum, it is sufficient that

$|H_1| < 0$   [that is, $\pi_{aa} < 0$, which can occur iff $Q_{aa} < 0$]

$|H_2| = |H| > 0$   [that is, $\pi_{aa}\pi_{bb} > \pi_{ab}^2$, which can occur iff $Q_{aa}Q_{bb} > Q_{ab}^2$]

Thus, we note, the second-order condition can be tested either with the $\pi_{ij}$ derivatives or
the $Q_{ij}$ derivatives, whichever are more convenient.

The symbol $Q_{aa}$ denotes the rate of change of $Q_a$ ($\equiv MPP_a$) as input a changes while
input b is fixed; similarly, $Q_{bb}$ denotes the rate of change of $Q_b$ ($\equiv MPP_b$) as input b changes
alone. So the second-order sufficient condition stipulates, in part, that the MPP of both
inputs be diminishing at the chosen input levels $a^*$ and $b^*$. Observe, however, that diminishing
$MPP_a$ and $MPP_b$ does not guarantee the satisfaction of the second-order condition,
because the latter condition also involves the magnitude of $Q_{ab} = Q_{ba}$, which measures the
rate of change of MPP of one input as the amount of the other input varies.

Upon further examination it emerges that, just as the first-order condition specifies the 
isoquant to be negatively sloped at the chosen input combination (as shown in the shaded 
area of Fig. 11.11), the second-order sufficient condition serves to specify that same isoquant 
to be strictly convex at the chosen input combination. The curvature of the isoquant is associated
with the sign of the second derivative $d^2b/da^2$. To obtain the latter, (11.40) must be
differentiated totally with respect to $a$, bearing in mind that both $Q_a$ and $Q_b$ are derivative
functions of $a$ and $b$ and yet, on an isoquant, $b$ is itself a function of $a$; that is,

$$Q_a = Q_a(a, b) \qquad Q_b = Q_b(a, b) \qquad \text{and} \qquad b = \phi(a)$$

The total differentiation thus proceeds as follows:

$$\frac{d^2 b}{da^2} = \frac{d}{da}\left(-\frac{Q_a}{Q_b}\right) = -\frac{1}{Q_b^2}\left[Q_b \frac{dQ_a}{da} - Q_a \frac{dQ_b}{da}\right] \qquad (11.42)$$


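Since $b$ is a function of $a$ on the isoquant, the total-derivative formula (8.9) gives us

$$\frac{dQ_a}{da} = \frac{\partial Q_a}{\partial b}\frac{db}{da} + \frac{\partial Q_a}{\partial a} = Q_{ba}\frac{db}{da} + Q_{aa}$$
$$\frac{dQ_b}{da} = \frac{\partial Q_b}{\partial b}\frac{db}{da} + \frac{\partial Q_b}{\partial a} = Q_{bb}\frac{db}{da} + Q_{ab} \qquad (11.43)$$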


After substituting (11.40) into (11.43) and then substituting the latter into (11.42), we can
rewrite the second derivative as

$$\frac{d^2 b}{da^2} = -\frac{1}{Q_b^2}\left[Q_{aa}Q_b - Q_{ba}Q_a - Q_{ab}Q_a + Q_{bb}Q_a^2\,\frac{1}{Q_b}\right]$$
$$\phantom{\frac{d^2 b}{da^2}} = -\frac{1}{Q_b^3}\left[Q_{aa}(Q_b)^2 - 2Q_{ab}(Q_a)(Q_b) + Q_{bb}(Q_a)^2\right] \qquad (11.44)$$

It is to be noted in (11.44) that the expression in brackets (last line) is a quadratic form in
the two variables $Q_a$ and $Q_b$. If the second-order sufficient condition is satisfied, so that

$$Q_{aa} < 0 \qquad \text{and} \qquad \begin{vmatrix} Q_{aa} & Q_{ab} \\ Q_{ab} & Q_{bb} \end{vmatrix} > 0$$

then, by virtue of (11.11'), the said quadratic form must be negative definite. This will in
turn make $d^2b/da^2$ positive, because $Q_b$ has been constrained to be positive by the first-order
condition. Thus the satisfaction of the second-order sufficient condition means that
the relevant (negatively sloped) isoquant is strictly convex at the chosen input combination,
as was asserted.
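The curvature formula (11.44) can be checked numerically. The sketch below is only an illustration under an assumed Cobb-Douglas function $Q = a^{0.3} b^{0.5}$, whose isoquant can be written explicitly as $b = K a^{-\alpha/\beta}$, so that $d^2b/da^2 = (\alpha/\beta)(\alpha/\beta + 1)\,b/a^2$. The functional form, the exponents, and the evaluation point are assumptions, not part of the text.

```python
import math

ALPHA, BETA = 0.3, 0.5  # assumed exponents of Q = a**ALPHA * b**BETA


def curvature_from_11_44(a, b):
    """d2b/da2 computed from the partial derivatives of Q, as in (11.44)."""
    q = a ** ALPHA * b ** BETA
    q_a = ALPHA * q / a
    q_b = BETA * q / b
    q_aa = ALPHA * (ALPHA - 1) * q / a ** 2
    q_ab = ALPHA * BETA * q / (a * b)
    q_bb = BETA * (BETA - 1) * q / b ** 2
    bracket = q_aa * q_b ** 2 - 2 * q_ab * q_a * q_b + q_bb * q_a ** 2
    return -bracket / q_b ** 3


def curvature_explicit(a, b):
    """d2b/da2 from differentiating the explicit isoquant b = K * a**(-ALPHA/BETA)."""
    ratio = ALPHA / BETA
    return ratio * (ratio + 1) * b / a ** 2


print(curvature_from_11_44(4.0, 9.0))
print(curvature_explicit(4.0, 9.0))
```

Both routes give the same positive number, which is what strict convexity of the isoquant requires at that point.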

The concept of strict convexity, as applied to an isoquant $b = \phi(a)$, which is drawn in the
two-dimensional $ab$ plane, should be carefully distinguished from the same concept as
applied to the production function $Q(a, b)$ itself, which is drawn in the three-dimensional
$abQ$ space. Note, in particular, that if we are to apply the concept of strict concavity or
convexity to the production function in the present context, then, to produce the desired
isoquant shape, the appropriate stipulation is that $Q(a, b)$ be strictly concave in the 3-space
(be dome-shaped), which is in sharp contradistinction to the stipulation that the relevant
isoquant be strictly convex in the 2-space (be U-shaped, or shaped like a part of a U).


Example 7

Next, suppose that interest is compounded quarterly instead, at a given interest rate of $i_0$
per quarter. Also suppose that the production process takes exactly a quarter of a year. The
profit function then becomes

$$\pi = P_0 Q(a, b)(1 + i_0)^{-1} - P_{a0}a - P_{b0}b$$

The first-order condition is now found to be

$$P_0 Q_a (1 + i_0)^{-1} - P_{a0} = 0$$
$$P_0 Q_b (1 + i_0)^{-1} - P_{b0} = 0$$

with an analytical interpretation entirely the same as in Example 6, except for the different
manner of discounting.

You can readily see that the same sufficient condition derived in Example 6 must apply
here as well.


EXERCISE 11.6 

1. If the competitive firm of Example 1 has the cost function $C = 2Q_1^2 + 2Q_2^2$ instead,
then:

(a) Will the production of the two goods still be technically related?

(b) What will be the new optimal levels of $Q_1$ and $Q_2$?

(c) What is the value of $\pi_{12}$? What does this imply economically?

2. A two-product firm faces the following demand and cost functions:

$$Q_1 = 40 - 2P_1 - P_2 \qquad Q_2 = 35 - P_1 - P_2 \qquad C = Q_1^2 + 2Q_2^2 + 10$$

(a) Find the output levels that satisfy the first-order condition for maximum profit. (Use
fractions.)

(b) Check the second-order sufficient condition. Can you conclude that this problem
possesses a unique absolute maximum?

(c) What is the maximal profit?

3. On the basis of the equilibrium price and quantity in Example 4, calculate the point
elasticity of demand $|\varepsilon_{di}|$ (for $i = 1, 2$). Which market has the highest and the lowest
demand elasticities?

4. If the cost function of Example 4 is changed to $C = 20 + 15Q + Q^2$:

(a) Find the new marginal-cost function.

(b) Find the new equilibrium quantities. (Use fractions.)

(c) Find the new equilibrium prices.

(d) Verify that the second-order sufficient condition is met.

5. In Example 7, how would you rewrite the profit function if the following conditions
hold?

(a) Interest is compounded semiannually at an interest rate of $i_0$ per annum, and the
production process takes 1 year.

(b) Interest is compounded quarterly at an interest rate of $i_0$ per annum, and the
production process takes 9 months.

6. Given $Q = Q(a, b)$, how would you express algebraically the isoquant for the output
level of, say, 260?


11.7 Comparative-Static Aspects of Optimization

Optimization, which is a special variety of static equilibrium analysis, is naturally also
subject to investigations of the comparative-static sort. The idea is, again, to find out how a
change in any parameter will affect the equilibrium position of the model, which in the
present context refers to the optimal values of the choice variables (and the optimal value
of the objective function). Since no new technique is involved beyond those discussed in
Part 3, we may proceed directly with some illustrations, based on the examples introduced
in Sec. 11.6.


Reduced-Form Solutions 

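Example 1 of Sec. 11.6 contains two parameters (or exogenous variables), $P_{10}$ and $P_{20}$; it
is not surprising, therefore, that the optimal output levels of this two-product firm are
expressed strictly in terms of these parameters:

$$Q_1^* = \frac{4P_{10} - P_{20}}{15} \qquad \text{and} \qquad Q_2^* = \frac{4P_{20} - P_{10}}{15}$$

These are reduced-form solutions, and simple partial differentiation alone is sufficient to
tell us all the comparative-static properties of the model, namely,

$$\frac{\partial Q_1^*}{\partial P_{10}} = \frac{4}{15} \qquad \frac{\partial Q_1^*}{\partial P_{20}} = -\frac{1}{15} \qquad \frac{\partial Q_2^*}{\partial P_{10}} = -\frac{1}{15} \qquad \frac{\partial Q_2^*}{\partial P_{20}} = \frac{4}{15}$$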


For maximum profit, each product of the firm should be produced in a larger quantity if its 
market price rises or if the market price of the other product falls. 
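A small sketch can confirm these comparative-static derivatives directly from the reduced-form solutions. The price levels used below (12 and 18) are arbitrary illustrative values, and the function names are our own; exact fractions are used so that the difference quotients reproduce the derivatives without rounding.

```python
from fractions import Fraction


def optimal_outputs(p10, p20):
    """Reduced-form solutions Q1* = (4*P10 - P20)/15 and Q2* = (4*P20 - P10)/15."""
    return Fraction(4 * p10 - p20, 15), Fraction(4 * p20 - p10, 15)


def comparative_statics(p10, p20, step=1):
    """Difference quotients; exact here because the reduced forms are linear."""
    q1, q2 = optimal_outputs(p10, p20)
    q1_p10, q2_p10 = optimal_outputs(p10 + step, p20)   # raise P10 only
    q1_p20, q2_p20 = optimal_outputs(p10, p20 + step)   # raise P20 only
    return {
        "dQ1/dP10": (q1_p10 - q1) / step,
        "dQ1/dP20": (q1_p20 - q1) / step,
        "dQ2/dP10": (q2_p10 - q2) / step,
        "dQ2/dP20": (q2_p20 - q2) / step,
    }


# Assumed (hypothetical) market prices, chosen only for illustration.
for name, value in comparative_statics(12, 18).items():
    print(name, value)
```

The own-price effects come out as 4/15 and the cross-price effects as -1/15, whatever price levels are chosen.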

Of course, these conclusions follow only from the particular assumptions of the model
in question. We may point out, in particular, that the effects of a change in $P_{10}$ on $Q_2^*$ and
of $P_{20}$ on $Q_1^*$ are consequences of the assumed technical relation on the production side of
these two commodities, and that in the absence of such a relation we shall have

$$\frac{\partial Q_1^*}{\partial P_{20}} = \frac{\partial Q_2^*}{\partial P_{10}} = 0$$


Moving on to Example 2, we note that the optimal output levels are there stated, numerically,
as $Q_1^* = 8$ and $Q_2^* = 7\tfrac{2}{3}$; no parameters appear. In fact, all the constants in the
equations of the model are numerical rather than parametric, so that by the time we reach
the solution stage those constants have all lost their respective identities through the
process of arithmetic manipulation. What this serves to underscore is the fundamental lack
of generality in the use of numerical constants and the consequent lack of comparative-static
content in the equilibrium solution.

On the other hand, the nonuse of numerical constants is no guarantee that a problem will
automatically become amenable to comparative-static analysis. The price-discrimination
problem (Example 3), for instance, was primarily set up for the study of the equilibrium
(profit-maximization) condition, and no parameter was introduced at all. Accordingly,
even though stated in terms of general functions, a reformulation will be necessary if a
comparative-static study is contemplated.


General-Function Models 

The input-decision problem of Example 6 illustrates the case where a general-function
formulation does embrace several parameters: in fact, no fewer than five ($P_0$, $P_{a0}$, $P_{b0}$, $r$,
and $t$), where we have, as before, omitted the 0 subscripts from the exogenous variables $r_0$
and $t_0$. How do we derive the comparative-static properties of this model?

The answer lies again in the application of the implicit-function theorem. But, unlike the
cases of nongoal-equilibrium models of the market or of national-income determination,
where we worked with the equilibrium conditions of the model, the present context of goal
equilibrium dictates that we work with the first-order conditions of optimization. For
Example 6, these conditions are stated in (11.39). Collecting all terms in (11.39) to the left
of the equals signs, and making explicit that $Q_a$ and $Q_b$ are both functions of the endogenous
(choice) variables $a$ and $b$, we can rewrite the first-order conditions in the format
of (8.24) as follows:

$$F^1(a, b; P_0, P_{a0}, P_{b0}, r, t) = P_0 Q_a(a, b) e^{-rt} - P_{a0} = 0$$
$$F^2(a, b; P_0, P_{a0}, P_{b0}, r, t) = P_0 Q_b(a, b) e^{-rt} - P_{b0} = 0 \qquad (11.45)$$

The functions $F^1$ and $F^2$ are assumed to possess continuous derivatives. Thus it would be
possible to apply the implicit-function theorem, provided the Jacobian of this system with
respect to the endogenous variables $a$ and $b$ does not vanish at the initial equilibrium. The
said Jacobian turns out to be nothing but the Hessian determinant of the $\pi$ function of
Example 6:

$$|J| = \begin{vmatrix} \dfrac{\partial F^1}{\partial a} & \dfrac{\partial F^1}{\partial b} \\ \dfrac{\partial F^2}{\partial a} & \dfrac{\partial F^2}{\partial b} \end{vmatrix} = \begin{vmatrix} P_0 Q_{aa} e^{-rt} & P_0 Q_{ab} e^{-rt} \\ P_0 Q_{ab} e^{-rt} & P_0 Q_{bb} e^{-rt} \end{vmatrix} = |H| \quad [\text{by (11.41)}] \qquad (11.46)$$

Hence, if we assume that the second-order sufficient condition for profit maximization is
satisfied, then $|H|$ must be positive, and so must be $|J|$, at the initial equilibrium or optimum.
In that event, the implicit-function theorem will enable us to write the pair of implicit
functions

$$a^* = a^*(P_0, P_{a0}, P_{b0}, r, t)$$
$$b^* = b^*(P_0, P_{a0}, P_{b0}, r, t) \qquad (11.47)$$

as well as the pair of identities

$$P_0 Q_a(a^*, b^*) e^{-rt} - P_{a0} \equiv 0$$
$$P_0 Q_b(a^*, b^*) e^{-rt} - P_{b0} \equiv 0 \qquad (11.48)$$


To study the comparative statics of the model, first take the total differential of each
identity in (11.48). For the time being, we shall permit all the exogenous variables to vary,
so that the result of total differentiation will involve $da^*$, $db^*$, as well as $dP_0$, $dP_{a0}$, $dP_{b0}$,
$dr$, and $dt$. If we place on the left side of the equals sign only those terms involving $da^*$ and
$db^*$, the result will be

$$P_0 Q_{aa} e^{-rt}\,da^* + P_0 Q_{ab} e^{-rt}\,db^* = -Q_a e^{-rt}\,dP_0 + dP_{a0} + P_0 Q_a t e^{-rt}\,dr + P_0 Q_a r e^{-rt}\,dt$$
$$P_0 Q_{ab} e^{-rt}\,da^* + P_0 Q_{bb} e^{-rt}\,db^* = -Q_b e^{-rt}\,dP_0 + dP_{b0} + P_0 Q_b t e^{-rt}\,dr + P_0 Q_b r e^{-rt}\,dt \qquad (11.49)$$

where, be it noted, the first and second derivatives of $Q$ are all to be evaluated at the
equilibrium, i.e., at $a^*$ and $b^*$. You will also note that the coefficients of $da^*$ and $db^*$ on the left
are precisely the elements of the Jacobian in (11.46).

To derive the specific comparative-static derivatives, of which there are a total of 10
(why?), we now shall allow only a single exogenous variable to vary at a time. Suppose we
let $P_0$ vary, alone. Then $dP_0 \neq 0$, but $dP_{a0} = dP_{b0} = dr = dt = 0$, so that only the first
term will remain on the right side of each equation in (11.49). Dividing through by $dP_0$,
and interpreting the ratio $da^*/dP_0$ to be the comparative-static derivative $(\partial a^*/\partial P_0)$, and
similarly for the ratio $db^*/dP_0$, we can write the matrix equation

$$\begin{bmatrix} P_0 Q_{aa} e^{-rt} & P_0 Q_{ab} e^{-rt} \\ P_0 Q_{ab} e^{-rt} & P_0 Q_{bb} e^{-rt} \end{bmatrix} \begin{bmatrix} (\partial a^*/\partial P_0) \\ (\partial b^*/\partial P_0) \end{bmatrix} = \begin{bmatrix} -Q_a e^{-rt} \\ -Q_b e^{-rt} \end{bmatrix}$$

The solution, by Cramer's rule, is found to be

$$\left(\frac{\partial a^*}{\partial P_0}\right) = \frac{(Q_b Q_{ab} - Q_a Q_{bb}) P_0 e^{-2rt}}{|J|}$$
$$\left(\frac{\partial b^*}{\partial P_0}\right) = \frac{(Q_a Q_{ab} - Q_b Q_{aa}) P_0 e^{-2rt}}{|J|} \qquad (11.50)$$

If you prefer, an alternative method is available for obtaining these results: You may simply
differentiate the two identities in (11.48) totally with respect to $P_0$ (while holding the other
four exogenous variables fixed), bearing in mind that $P_0$ can affect $a^*$ and $b^*$ via (11.47).

Let us now analyze the signs of the comparative-static derivatives in (11.50). On the
assumption that the second-order sufficient condition is satisfied, the Jacobian in the
denominator must be positive. The second-order condition also implies that $Q_{aa}$ and $Q_{bb}$
are negative, just as the first-order condition implies that $Q_a$ and $Q_b$ are positive. Moreover,
the expression $P_0 e^{-2rt}$ is certainly positive. Thus, if $Q_{ab} > 0$ (if increasing one input will
raise the MPP of the other input), we can conclude that both $(\partial a^*/\partial P_0)$ and $(\partial b^*/\partial P_0)$ will
be positive, implying that an increase in the product price will result in increased employment
of both inputs in equilibrium. If $Q_{ab} < 0$, on the other hand, the sign of each derivative
in (11.50) will depend on the relative strength of the negative force and the positive
force in the parenthetical expression on the right.
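To see (11.50) at work, the sketch below compares it with a finite-difference estimate. It assumes a Cobb-Douglas production function $Q = a^{0.3} b^{0.5}$ (for which $Q_{ab} > 0$ and the first-order conditions can be solved in closed form) and made-up parameter values; none of these choices come from the text.

```python
import math

ALPHA, BETA = 0.3, 0.5  # assumed exponents, with ALPHA + BETA < 1


def optimum(p0, pa0, pb0, r, t):
    """Closed-form (a*, b*) solving the first-order conditions for Q = a**ALPHA * b**BETA."""
    disc = p0 * math.exp(-r * t)        # discounted product price
    k = BETA * pa0 / (ALPHA * pb0)      # ratio b*/a* from dividing the two conditions
    a = (pa0 / (disc * ALPHA * k ** BETA)) ** (1.0 / (ALPHA + BETA - 1.0))
    return a, k * a


def derivatives_11_50(p0, pa0, pb0, r, t):
    """Evaluate (da*/dP0, db*/dP0) from formula (11.50)."""
    a, b = optimum(p0, pa0, pb0, r, t)
    q = a ** ALPHA * b ** BETA
    q_a, q_b = ALPHA * q / a, BETA * q / b
    q_aa = ALPHA * (ALPHA - 1) * q / a ** 2
    q_ab = ALPHA * BETA * q / (a * b)
    q_bb = BETA * (BETA - 1) * q / b ** 2
    scale = p0 * math.exp(-2 * r * t)               # P0 * e^{-2rt}
    jac = p0 * scale * (q_aa * q_bb - q_ab ** 2)    # |J| = |H| from (11.46)
    return ((q_b * q_ab - q_a * q_bb) * scale / jac,
            (q_a * q_ab - q_b * q_aa) * scale / jac)


def finite_difference(p0, pa0, pb0, r, t, h=1e-5):
    """Central-difference estimate of (da*/dP0, db*/dP0)."""
    a_hi, b_hi = optimum(p0 + h, pa0, pb0, r, t)
    a_lo, b_lo = optimum(p0 - h, pa0, pb0, r, t)
    return (a_hi - a_lo) / (2 * h), (b_hi - b_lo) / (2 * h)


params = (10.0, 2.0, 3.0, 0.05, 1.0)  # assumed P0, Pa0, Pb0, r, t
print(derivatives_11_50(*params))
print(finite_difference(*params))
```

The two pairs of numbers agree closely and both derivatives are positive, consistent with the sign analysis for the case $Q_{ab} > 0$.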

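Next, let the exogenous variable $r$ vary, alone. Then all the terms on the right of (11.49)
will vanish except those involving $dr$. Dividing through by $dr \neq 0$, we now obtain the
following matrix equation

$$\begin{bmatrix} P_0 Q_{aa} e^{-rt} & P_0 Q_{ab} e^{-rt} \\ P_0 Q_{ab} e^{-rt} & P_0 Q_{bb} e^{-rt} \end{bmatrix} \begin{bmatrix} (\partial a^*/\partial r) \\ (\partial b^*/\partial r) \end{bmatrix} = \begin{bmatrix} P_0 Q_a t e^{-rt} \\ P_0 Q_b t e^{-rt} \end{bmatrix}$$

with the solution

$$\left(\frac{\partial a^*}{\partial r}\right) = \frac{t(Q_a Q_{bb} - Q_b Q_{ab})(P_0 e^{-rt})^2}{|J|}$$
$$\left(\frac{\partial b^*}{\partial r}\right) = \frac{t(Q_b Q_{aa} - Q_a Q_{ab})(P_0 e^{-rt})^2}{|J|} \qquad (11.51)$$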


Both of these comparative-static derivatives will be negative if $Q_{ab}$ is positive, but
indeterminate in sign if $Q_{ab}$ is negative.

By a similar procedure, we may find the effects of changes in the remaining parameters.
Actually, in view of the symmetry between $r$ and $t$ in (11.48), it is immediately obvious that
both $(\partial a^*/\partial t)$ and $(\partial b^*/\partial t)$ must be similar in appearance to (11.51).
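The matrix equation for a change in $r$ can also be solved numerically by Cramer's rule and compared with a finite-difference estimate. As before, this is only a sketch: the Cobb-Douglas form $Q = a^{0.3} b^{0.5}$, the parameter values, and the helper names are assumptions made for illustration.

```python
import math

ALPHA, BETA = 0.3, 0.5  # assumed exponents, with ALPHA + BETA < 1


def optimum(p0, pa0, pb0, r, t):
    """Closed-form (a*, b*) for the assumed Cobb-Douglas production function."""
    disc = p0 * math.exp(-r * t)
    k = BETA * pa0 / (ALPHA * pb0)
    a = (pa0 / (disc * ALPHA * k ** BETA)) ** (1.0 / (ALPHA + BETA - 1.0))
    return a, k * a


def solve_2x2(m, rhs):
    """Cramer's rule for the 2x2 linear system m * x = rhs."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    x1 = (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det
    return x0, x1


def effect_of_r(p0, pa0, pb0, r, t):
    """Solve the matrix equation preceding (11.51) for (da*/dr, db*/dr)."""
    a, b = optimum(p0, pa0, pb0, r, t)
    q = a ** ALPHA * b ** BETA
    q_a, q_b = ALPHA * q / a, BETA * q / b
    q_aa = ALPHA * (ALPHA - 1) * q / a ** 2
    q_ab = ALPHA * BETA * q / (a * b)
    q_bb = BETA * (BETA - 1) * q / b ** 2
    disc = p0 * math.exp(-r * t)
    jacobian = [[disc * q_aa, disc * q_ab], [disc * q_ab, disc * q_bb]]
    rhs = [disc * q_a * t, disc * q_b * t]
    return solve_2x2(jacobian, rhs)


def finite_difference_r(p0, pa0, pb0, r, t, h=1e-6):
    """Central-difference estimate of (da*/dr, db*/dr)."""
    a_hi, b_hi = optimum(p0, pa0, pb0, r + h, t)
    a_lo, b_lo = optimum(p0, pa0, pb0, r - h, t)
    return (a_hi - a_lo) / (2 * h), (b_hi - b_lo) / (2 * h)


params = (10.0, 2.0, 3.0, 0.05, 1.0)  # assumed P0, Pa0, Pb0, r, t
print(effect_of_r(*params))
print(finite_difference_r(*params))
```

With $Q_{ab} > 0$ in this assumed case, both derivatives come out negative: a higher discount rate reduces the optimal use of both inputs.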

The effects of changes in $P_{a0}$ and $P_{b0}$ are left to you to analyze. As you will find, the sign
restriction of the second-order sufficient condition will again be useful in evaluating the
comparative-static derivatives, because it can tell us the signs of $Q_{aa}$ and $Q_{bb}$ as well as the
Jacobian $|J|$ at the initial equilibrium (optimum). Thus, aside from distinguishing between
maximum and minimum, the second-order condition also has a vital role to play in the
study of shifts in equilibrium positions.


EXERCISE 11.7 

For Probs. 1 through 3, assume that $Q_{ab} > 0$.

1. On the basis of the model described in (11.45) through (11.48), find the comparative-static
derivatives $(\partial a^*/\partial P_{a0})$ and $(\partial b^*/\partial P_{a0})$. Interpret the economic meaning of the
result. Then analyze the effects on $a^*$ and $b^*$ of a change in $P_{b0}$.

2. For the problem of Example 7 in Sec. 11.6:

(a) How many parameters are there? Enumerate them.

(b) Following the procedure described in (11.45) through (11.50), and assuming that
the second-order sufficient condition is satisfied, find the comparative-static
derivatives $(\partial a^*/\partial P_0)$ and $(\partial b^*/\partial P_0)$. Evaluate their signs and interpret their
economic meanings.

(c) Find $(\partial a^*/\partial i_0)$ and $(\partial b^*/\partial i_0)$, evaluate their signs, and interpret their economic
meanings.

3. Show that the results in (11.50) can be obtained alternatively by differentiating the two
identities in (11.48) totally with respect to $P_0$, while holding the other exogenous variables
fixed. Bear in mind that $P_0$ can affect $a^*$ and $b^*$ by virtue of (11.47).

4. A Jacobian determinant, as defined in (7.27), is made up of first-order partial derivatives.
On the other hand, a Hessian determinant, as defined in Secs. 11.3 and 11.4, has
as its elements second-order partial derivatives. How, then, can it turn out that
$|J| = |H|$, as in (11.46)?



Chapter 12

Optimization with Equality Constraints


Chapter 11 presented a general method for finding the relative extrema of an objective
function of two or more choice variables. One important feature of that discussion is that
all the choice variables are independent of one another, in the sense that the decision made
regarding one variable does not impinge upon the choices of the remaining variables. For
instance, a two-product firm can choose any value for $Q_1$ and any value for $Q_2$ it wishes,
without the two choices limiting each other.

If the said firm is somehow required to observe a restriction (such as a production quota)
in the form of $Q_1 + Q_2 = 950$, however, the independence between the choice variables
will be lost. In that event, the firm's profit-maximizing output levels $Q_1^*$ and $Q_2^*$ will be not
only simultaneous but also dependent, because the higher $Q_1^*$ is, the lower $Q_2^*$ must
correspondingly be, in order to stay within the combined quota of 950. The new optimum
satisfying the production quota constitutes a constrained optimum, which, in general, may be
expected to differ from the free optimum discussed in Chap. 11.

A restriction, such as the production quota mentioned before, establishes a relationship
between the two variables in their roles as choice variables, but this should be distinguished
from other types of relationships that may link the variables together. For instance, in
Example 2 of Sec. 11.6, the two products of the firm are related in consumption (substitutes)
as well as in production (as is reflected in the cost function), but that fact does not qualify
the problem as one of constrained optimization, since the two output variables are still
independent as choice variables. Only the dependence of the variables qua choice variables
gives rise to a constrained optimum.

In the present chapter, we shall consider equality constraints only, such as $Q_1 + Q_2 = 950$.
Our primary concern will be with relative constrained extrema, although absolute
ones will also be discussed in Sec. 12.4.

12.1 Effects of a Constraint 


The primary purpose of imposing a constraint is to give due cognizance to certain limiting
factors present in the optimization problem under discussion.

We have already seen the limitation on output choices that results from a production
quota. For further illustration, let us consider a consumer with the simple utility (index)
function

$$U = x_1 x_2 + 2x_1 \qquad (12.1)$$

Since the marginal utilities (the partial derivatives $U_1 \equiv \partial U/\partial x_1$ and $U_2 \equiv \partial U/\partial x_2$) are
positive for all positive levels of $x_1$ and $x_2$ here, to have $U$ maximized without any constraint,
the consumer should purchase an infinite amount of both goods, a solution that
obviously has little practical relevance. To render the optimization problem meaningful, the
purchasing power of the consumer must also be taken into account; i.e., a budget constraint
should be incorporated into the problem. If the consumer intends to spend a given sum, say,
$60, on the two goods and if the current prices are $P_{10} = 4$ and $P_{20} = 2$, then the budget
constraint can be expressed by the linear equation

$$4x_1 + 2x_2 = 60 \qquad (12.2)$$

Such a constraint, like the production quota referred to earlier, renders the choices of $x_1^*$
and $x_2^*$ mutually dependent.

The problem now is to maximize (12.1), subject to the constraint stated in (12.2).
Mathematically, what the constraint (variously called restraint, side relation, or subsidiary
condition) does is to narrow the domain, and hence the range of the objective function. The
domain of (12.1) would normally be the set $\{(x_1, x_2) \mid x_1 \geq 0, x_2 \geq 0\}$. Graphically, the
domain is represented by the nonnegative quadrant of the $x_1 x_2$ plane in Fig. 12.1a. After
the budget constraint (12.2) is added, however, we can admit only those values of the variables
which satisfy this latter equation, so that the domain is immediately reduced to the set
of points lying on the budget line. This will automatically affect the range of the objective
function, too; only that subset of the utility surface lying directly above the budget-constraint
line will now be relevant. The said subset (a cross section of the surface) may look like the
curve in Fig. 12.1b, where $U$ is plotted on the vertical axis, with the budget line of diagram
a placed on the horizontal axis. Our interest, then, is only in locating the maximum on the
curve in diagram b.

In general, for a function $z = f(x, y)$, the difference between a constrained extremum
and a free extremum may be illustrated in the three-dimensional graph of Fig. 12.2. The
free extremum in this particular graph is the peak point of the entire dome, but the constrained
extremum is at the peak of the inverse U-shaped curve situated on top of (i.e., lying
directly above) the constraint line. In general, a constrained maximum can be expected to
have a lower value than the free maximum, although, by coincidence, the two maxima may
happen to have the same value. But the constrained maximum can never exceed the free
maximum.
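A small numerical sketch may make the comparison concrete. The dome-shaped function $z = 10 - (x - 1)^2 - (y - 2)^2$ and the constraint line $x + y = 5$ used here are hypothetical choices, not taken from Fig. 12.2; the constrained maximum is located by a simple grid search along the line.

```python
def f(x, y):
    """Hypothetical dome-shaped objective with its free peak at (1, 2)."""
    return 10 - (x - 1) ** 2 - (y - 2) ** 2


# Free maximum: the peak of the whole dome.
free_max = f(1, 2)

# Constrained maximum: search only over points on the assumed line x + y = 5.
candidates = [(i / 100, 5 - i / 100) for i in range(501)]
x_c, y_c = max(candidates, key=lambda point: f(*point))
constrained_max = f(x_c, y_c)

print("free maximum:", free_max)
print("constrained maximum:", constrained_max, "at", (x_c, y_c))
```

Here the constraint line does not pass through the peak of the dome, so the constrained maximum lies below the free one; had the line passed through $(1, 2)$, the two values would have coincided.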

It is interesting to note that, had we added another constraint intersecting the first constraint
at a single point in the $xy$ plane, the two constraints together would have restricted
the domain to that single point. Then the locating of the extremum would become a trivial
matter. In a meaningful problem, the number and the nature of the constraints should be
such as to restrict, but not eliminate, the possibility of choice. Generally, the number of
constraints should be less than the number of choice variables.


12.2 Finding the Stationary Values

Even without any new technique of solution, the constrained maximum in the simple
example defined by (12.1) and (12.2) can easily be found. Since the constraint (12.2)
implies

$$x_2 = \frac{60 - 4x_1}{2} = 30 - 2x_1 \qquad (12.2')$$

we can combine the constraint with the objective function by substituting (12.2') into
(12.1). The result is an objective function in one variable only:

$$U = x_1(30 - 2x_1) + 2x_1 = 32x_1 - 2x_1^2$$

which can be handled with the method already learned. By setting $dU/dx_1 = 32 - 4x_1$
equal to zero, we get the solution $x_1^* = 8$, which by virtue of (12.2') immediately leads to
$x_2^* = 30 - 2(8) = 14$. From (12.1), we can then find the stationary value $U^* = 128$; and
since the second derivative is $d^2U/dx_1^2 = -4 < 0$, that stationary value constitutes a
(constrained) maximum of $U$.¹
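The substitution steps can be replayed with exact arithmetic. In the sketch below the function names are our own; the derivative $32 - 4x_1$ is taken from the text rather than computed symbolically.

```python
from fractions import Fraction


def x2_on_budget(x1):
    """Constraint (12.2) solved for x2: x2 = (60 - 4*x1)/2."""
    return (60 - 4 * x1) / Fraction(2)


def utility(x1, x2):
    """Utility function (12.1)."""
    return x1 * x2 + 2 * x1


def reduced_utility(x1):
    """Utility along the budget line, a function of x1 only."""
    return utility(x1, x2_on_budget(x1))


# First-order condition of the reduced problem: dU/dx1 = 32 - 4*x1 = 0.
x1_star = Fraction(32, 4)
x2_star = x2_on_budget(x1_star)
u_star = reduced_utility(x1_star)
print(x1_star, x2_star, u_star)  # 8 14 128

# Nearby points on the budget line give lower utility, as a maximum requires.
step = Fraction(1, 10)
print(reduced_utility(x1_star - step) < u_star, reduced_utility(x1_star + step) < u_star)
```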

When the constraint is itself a complicated function, or when there are several constraints
to consider, however, the technique of substitution and elimination of variables
could become a burdensome task. More importantly, when the constraint comes in a form
such that we cannot solve it to express one variable ($x_2$) as an explicit function of the other
($x_1$), the elimination method would in fact be of no avail, even if $x_2$ were known to be
an implicit function of $x_1$, that is, even if the conditions of the implicit-function theorem
were satisfied. In such cases, we may resort to a method known as the method of Lagrange
(undetermined) multiplier, which, as we shall see, has distinct analytical advantages.

Lagrange-Multiplier Method 

The essence of the Lagrange-multiplier method is to convert a constrained-extremum problem
into a form such that the first-order condition of the free-extremum problem can still
be applied.

Given the problem of maximizing $U = x_1 x_2 + 2x_1$, subject to the constraint
$4x_1 + 2x_2 = 60$ [from (12.1) and (12.2)], let us write what is referred to as the Lagrangian
function, which is a modified version of the objective function that incorporates the constraint
as follows:

$$Z = x_1 x_2 + 2x_1 + \lambda(60 - 4x_1 - 2x_2) \qquad (12.3)$$

The symbol $\lambda$ (the Greek letter lambda), representing some as yet undetermined number, is
called a Lagrange (undetermined) multiplier. If we can somehow be assured that
$4x_1 + 2x_2 = 60$, so that the constraint will be satisfied, then the last term in (12.3) will vanish
regardless of the value of $\lambda$. In that event, $Z$ will be identical with $U$. Moreover, with the
constraint out of the way, we only have to seek the free maximum of $Z$, in lieu of the
constrained maximum of $U$, with respect to the two variables $x_1$ and $x_2$. The question is:
How can we make the parenthetical expression in (12.3) vanish?

The tactic that will accomplish this is simply to treat $\lambda$ as an additional choice variable
in (12.3), i.e., to consider $Z = Z(\lambda, x_1, x_2)$. For then the first-order condition for free
extremum will consist of the set of simultaneous equations

$$Z_\lambda\,(\equiv \partial Z/\partial \lambda) = 60 - 4x_1 - 2x_2 = 0$$
$$Z_1\,(\equiv \partial Z/\partial x_1) = x_2 + 2 - 4\lambda = 0 \qquad (12.4)$$
$$Z_2\,(\equiv \partial Z/\partial x_2) = x_1 - 2\lambda = 0$$

and the first equation will automatically guarantee the satisfaction of the constraint. Thus,
by incorporating the constraint into the Lagrangian function $Z$ and by treating the Lagrange
multiplier as an extra variable, we can obtain the constrained extremum $U^*$ (two choice
variables) simply by screening the stationary values of $Z$, taken as a free function of three
choice variables.
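Because (12.4) is linear in $\lambda$, $x_1$, and $x_2$, it can be solved mechanically. The following sketch applies Cramer's rule with exact fractions; the helper names `det3` and `cramer3` are our own and the approach is just one convenient way to solve the system.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def cramer3(m, rhs):
    """Solve m * v = rhs by Cramer's rule, returning exact fractions."""
    d = det3(m)
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[i]] + row[col + 1:] for i, row in enumerate(m)]
        solution.append(Fraction(det3(replaced), d))
    return solution


# System (12.4) in the unknowns (lambda, x1, x2), constants moved to the right:
#   4*x1 + 2*x2       = 60
#  -4*lambda + x2     = -2
#  -2*lambda + x1     = 0
coefficients = [[0, 4, 2], [-4, 0, 1], [-2, 1, 0]]
constants = [60, -2, 0]

lam, x1, x2 = cramer3(coefficients, constants)
print(lam, x1, x2)                                        # 4 8 14
print(x1 * x2 + 2 * x1 + lam * (60 - 4 * x1 - 2 * x2))    # Z* = 128
```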

Solving (12.4) for the critical values of the variables, we find $x_1^* = 8$, $x_2^* = 14$ (and
$\lambda^* = 4$). As expected, the values of $x_1^*$ and $x_2^*$ check with the answers already obtained by
the substitution method. Furthermore, it is clear from (12.3) that $Z^* = 128$; this is identical
with the value of $U^*$ found earlier, as it should be.

¹ You may recall that for the flower-bed problem of Exercise 9.4-2 the same technique of substitution
was applied to find the maximum area, using a constraint (the available quantity of wire netting) to
eliminate one of the two variables (the length or the width of the flower bed).

In general, given an objective function

$$z = f(x, y) \qquad (12.5)$$

subject to the constraint

$$g(x, y) = c \qquad (12.6)$$

where $c$ is a constant,² we can write the Lagrangian function as

$$Z = f(x, y) + \lambda[c - g(x, y)] \qquad (12.7)$$

For stationary values of $Z$, regarded as a function of the three variables $\lambda$, $x$, and $y$, the
necessary condition is

$$Z_\lambda = c - g(x, y) = 0$$
$$Z_x = f_x - \lambda g_x = 0 \qquad (12.8)$$
$$Z_y = f_y - \lambda g_y = 0$$

Since the first equation in (12.8) is simply a restatement of (12.6), the stationary values of
the Lagrangian function $Z$ will automatically satisfy the constraint of the original function
$z$. And since the expression $\lambda[c - g(x, y)]$ is now assuredly zero, the stationary values of $Z$
in (12.7) must be identical with those of (12.5), subject to (12.6).

Let us illustrate the method with two more examples.

Example 1

Find the extremum of

$$z = xy \qquad \text{subject to} \qquad x + y = 6$$

The first step is to write the Lagrangian function

$$Z = xy + \lambda(6 - x - y)$$

For a stationary value of $Z$, it is necessary that

$$Z_\lambda = 6 - x - y = 0$$
$$Z_x = y - \lambda = 0$$
$$Z_y = x - \lambda = 0$$

or

$$x + y = 6$$
$$-\lambda + y = 0$$
$$-\lambda + x = 0$$

Thus, by Cramer's rule or some other method, we can find

$$\lambda^* = 3 \qquad x^* = 3 \qquad y^* = 3$$

The stationary value is $Z^* = z^* = 9$, which needs to be tested against a second-order
condition before we can tell whether it is a maximum or minimum (or neither). That will be
taken up in Sec. 12.3.


² It is also possible to subsume the constant $c$ under the constraint function so that (12.6) appears
instead as $G(x, y) = 0$, where $G(x, y) = g(x, y) - c$. In that case, (12.7) should be changed to
$Z = f(x, y) + \lambda[0 - G(x, y)] = f(x, y) - \lambda G(x, y)$. The version in (12.6) is chosen because it
facilitates the study of the comparative-static effect of a change in the constraint constant later
[see (12.16)].

Example 2

Find the extremum of

$$z = x_1^2 + x_2^2 \qquad \text{subject to} \qquad x_1 + 4x_2 = 2$$

The Lagrangian function is

$$Z = x_1^2 + x_2^2 + \lambda(2 - x_1 - 4x_2)$$

for which the necessary condition for a stationary value is

$$Z_\lambda = 2 - x_1 - 4x_2 = 0$$
$$Z_1 = 2x_1 - \lambda = 0$$
$$Z_2 = 2x_2 - 4\lambda = 0$$

or

$$x_1 + 4x_2 = 2$$
$$-\lambda + 2x_1 = 0$$
$$-4\lambda + 2x_2 = 0$$

The stationary value of $Z$, defined by the solution

$$\lambda^* = \frac{4}{17} \qquad x_1^* = \frac{2}{17} \qquad x_2^* = \frac{8}{17}$$

is therefore $Z^* = z^* = \frac{4}{17}$. Again, a second-order condition should be consulted before we
can tell whether $z^*$ is a maximum or a minimum.
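The reported solutions of Examples 1 and 2 can be verified by substituting them back into the respective first-order conditions. The sketch below does this with exact fractions; the function names are illustrative only.

```python
from fractions import Fraction


def gradient_example_1(lam, x, y):
    """(Z_lambda, Z_x, Z_y) for Z = x*y + lam*(6 - x - y)."""
    return (6 - x - y, y - lam, x - lam)


def gradient_example_2(lam, x1, x2):
    """(Z_lambda, Z_1, Z_2) for Z = x1**2 + x2**2 + lam*(2 - x1 - 4*x2)."""
    return (2 - x1 - 4 * x2, 2 * x1 - lam, 2 * x2 - 4 * lam)


# Solutions (lambda*, first variable, second variable) as stated in the examples.
sol_1 = (Fraction(3), Fraction(3), Fraction(3))
sol_2 = (Fraction(4, 17), Fraction(2, 17), Fraction(8, 17))

# Every partial derivative of Z should vanish at a stationary point.
print(gradient_example_1(*sol_1))
print(gradient_example_2(*sol_2))

# Stationary values of the original objective functions.
z_1 = sol_1[1] * sol_1[2]
z_2 = sol_2[1] ** 2 + sol_2[2] ** 2
print(z_1, z_2)  # 9 4/17
```

A check of this kind confirms only that the points are stationary; whether each is a maximum or a minimum still depends on the second-order condition.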


Total-Differential Approach 

In the discussion of the free extremum of $z = f(x, y)$, it was learned that the first-order
necessary condition may be stated in terms of the total differential $dz$ as follows:

$$dz = f_x\,dx + f_y\,dy = 0 \qquad (12.9)$$

This statement remains valid after a constraint $g(x, y) = c$ is added. However, with the
constraint in the picture, we can no longer take both $dx$ and $dy$ as "arbitrary" changes as
before. For if $g(x, y) = c$, then $dg$ must be equal to $dc$, which is zero since $c$ is a constant.
Hence,

$$(dg =)\; g_x\,dx + g_y\,dy = 0 \qquad (12.10)$$

and this relation makes $dx$ and $dy$ dependent on each other. The first-order necessary
condition therefore becomes $dz = 0$ [(12.9)], subject to $g = c$, and hence also subject to
$dg = 0$ [(12.10)]. By visual inspection of (12.9) and (12.10), it should be clear that, in
order to satisfy this necessary condition, we must have

$$\frac{f_x}{g_x} = \frac{f_y}{g_y} \qquad (12.11)$$

This result can be verified by solving (12.10) for $dy$ and substituting the result into (12.9).
The condition (12.11), together with the constraint $g(x, y) = c$, will provide two equations
from which to find the critical values of $x$ and $y$.³

Does the total-differential approach yield the same first-order condition as the Lagrange-multiplier
method? Let us compare (12.8) with the result just obtained. The first equation
in (12.8) merely repeats the constraint; the new result requires its satisfaction also. The last
two equations in (12.8) can be rewritten, respectively, as

$$\frac{f_x}{g_x} = \lambda \qquad \text{and} \qquad \frac{f_y}{g_y} = \lambda \qquad (12.11')$$

and these convey precisely the same information as (12.11). Note, however, that whereas
the total-differential approach yields only the values of $x^*$ and $y^*$, the Lagrange-multiplier
method also gives the value of $\lambda^*$ as a direct by-product. As it turns out, $\lambda^*$ provides a measure
of the sensitivity of $Z^*$ (and $z^*$) to a shift of the constraint, as we shall presently
demonstrate. Therefore, the Lagrange-multiplier method offers the advantage of containing
certain built-in comparative-static information in the solution.

³ Note that the constraint $g = c$ is still to be considered along with (12.11), even though we have
utilized the equation $dg = 0$, that is, (12.10), in deriving (12.11). While $g = c$ necessarily implies
$dg = 0$, the converse is not true: $dg = 0$ merely implies $g =$ a constant (not necessarily $c$). Unless the
constraint is explicitly considered, therefore, some information will be unwittingly left out of the
problem.
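For the utility example of (12.1) and (12.2), condition (12.11) can be checked directly, since $f_x = x_2 + 2$, $f_y = x_1$, $g_x = 4$, and $g_y = 2$. The sketch below, with helper names of our own choosing, evaluates the two ratios at the optimum and at another point on the budget line.

```python
from fractions import Fraction

G_X, G_Y = 4, 2  # partial derivatives of g(x1, x2) = 4*x1 + 2*x2


def f_x(x1, x2):
    """Partial derivative of U = x1*x2 + 2*x1 with respect to x1."""
    return x2 + 2


def f_y(x1, x2):
    """Partial derivative of U = x1*x2 + 2*x1 with respect to x2."""
    return x1


def ratios(x1, x2):
    """Return (f_x/g_x, f_y/g_y), the two sides of condition (12.11)."""
    return Fraction(f_x(x1, x2), G_X), Fraction(f_y(x1, x2), G_Y)


# At the optimum (8, 14) the two ratios coincide, and their common value is 4.
print(ratios(8, 14))

# At another point on the budget line, e.g. (5, 20), they differ.
print(ratios(5, 20))
```

The common value of the two ratios at the optimum is exactly the $\lambda^* = 4$ obtained from (12.4), in line with (12.11').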


An Interpretation of the Lagrange Multiplier 

To show that $\lambda^*$ indeed measures the sensitivity of $Z^*$ to changes in the constraint, let us
perform a comparative-static analysis on the first-order condition (12.8). Since $\lambda$, $x$, and $y$
are endogenous, the only available exogenous variable is the constraint parameter $c$. A
change in $c$ would cause a shift of the constraint curve in the $xy$ plane and thereby alter the
optimal solution. In particular, the effect of an increase in $c$ (a larger budget, or a larger
production quota) would indicate how the optimal solution is affected by a relaxation of the
constraint.

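To do the comparative-static analysis, we again resort to the implicit-function theorem.
Taking the three equations in (12.8) to be in the form of $F^j(\lambda, x, y; c) = 0$ (with
$j = 1, 2, 3$), and assuming them to have continuous partial derivatives, we must first check
that the following endogenous-variable Jacobian (where $f_{xy} = f_{yx}$, and $g_{xy} = g_{yx}$)

$$|J| = \begin{vmatrix} \dfrac{\partial F^1}{\partial \lambda} & \dfrac{\partial F^1}{\partial x} & \dfrac{\partial F^1}{\partial y} \\ \dfrac{\partial F^2}{\partial \lambda} & \dfrac{\partial F^2}{\partial x} & \dfrac{\partial F^2}{\partial y} \\ \dfrac{\partial F^3}{\partial \lambda} & \dfrac{\partial F^3}{\partial x} & \dfrac{\partial F^3}{\partial y} \end{vmatrix} = \begin{vmatrix} 0 & -g_x & -g_y \\ -g_x & f_{xx} - \lambda g_{xx} & f_{xy} - \lambda g_{xy} \\ -g_y & f_{xy} - \lambda g_{xy} & f_{yy} - \lambda g_{yy} \end{vmatrix} \qquad (12.12)$$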

does not vanish in the optimal state. At this moment, there is certainly no inkling that this
would be the case. But our previous experience with the comparative statics of optimization
problems [see the discussion of (11.46)] would suggest that this Jacobian is closely related
to the second-order sufficient condition, and that if the sufficient condition is satisfied,
then the Jacobian will be nonzero at the equilibrium (optimum). Leaving the full demonstration
of this fact to Sec. 12.3, let us proceed on the assumption that $|J| \neq 0$. If so, then
we can express $\lambda^*$, $x^*$, and $y^*$ all as implicit functions of the parameter $c$:

$$\lambda^* = \lambda^*(c) \qquad x^* = x^*(c) \qquad \text{and} \qquad y^* = y^*(c) \qquad (12.13)$$

all of which will have continuous derivatives. Also, we have the equilibrium identities

$$c - g(x^*, y^*) \equiv 0$$
$$f_x(x^*, y^*) - \lambda^* g_x(x^*, y^*) \equiv 0$$
$$f_y(x^*, y^*) - \lambda^* g_y(x^*, y^*) \equiv 0 \qquad (12.14)$$

Now since the optimal value of $Z$ depends on $\lambda^*$, $x^*$, and $y^*$, that is,

$$Z^* = f(x^*, y^*) + \lambda^*[c - g(x^*, y^*)] \qquad (12.15)$$

we may, in view of (12.13), consider $Z^*$ to be a function of $c$ alone. Differentiating $Z^*$
totally with respect to $c$, we find

$$\frac{dZ^*}{dc} = f_x \frac{dx^*}{dc} + f_y \frac{dy^*}{dc} + [c - g(x^*, y^*)]\frac{d\lambda^*}{dc} + \lambda^*\left(1 - g_x \frac{dx^*}{dc} - g_y \frac{dy^*}{dc}\right)$$
$$\phantom{\frac{dZ^*}{dc}} = (f_x - \lambda^* g_x)\frac{dx^*}{dc} + (f_y - \lambda^* g_y)\frac{dy^*}{dc} + [c - g(x^*, y^*)]\frac{d\lambda^*}{dc} + \lambda^*$$

where $f_x$, $f_y$, $g_x$, and $g_y$ are all to be evaluated at the optimum. By (12.14), however, the
first three terms on the right will all drop out. Thus we are left with the simple result

$$\frac{dZ^*}{dc} = \lambda^* \qquad (12.16)$$


which validates our claim that the solution value of the Lagrange multiplier constitutes a 
measure of the effect of a change in the constraint via the parameter c on the optimal value 
of the objective function. 
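
As an informal numerical check of (12.16), the following Python sketch assumes the illustrative problem z = xy subject to x + y = c. The functional form, the value c = 6, and the helper names `solve_foc` and `optimal_value` are our own assumptions for the sake of illustration. The sketch compares a finite-difference estimate of dZ*/dc with the solution value of the multiplier.

```python
# Illustrative assumption: maximize z = x*y subject to x + y = c.
# First-order conditions:  c - x - y = 0,  y - lam = 0,  x - lam = 0.

def solve_foc(c):
    """Closed-form solution of the first-order conditions for this example."""
    x_star = c / 2.0
    y_star = c / 2.0
    lam_star = c / 2.0          # from y - lam = 0
    return x_star, y_star, lam_star


def optimal_value(c):
    """Z* = f(x*, y*) + lam*[c - g(x*, y*)]; the bracket vanishes at the optimum."""
    x_star, y_star, lam_star = solve_foc(c)
    return x_star * y_star + lam_star * (c - x_star - y_star)


c = 6.0
h = 1e-6
# Central finite-difference approximation of dZ*/dc.
dZ_dc = (optimal_value(c + h) - optimal_value(c - h)) / (2 * h)
lam_star = solve_foc(c)[2]
print(dZ_dc, lam_star)
```

Under these assumptions the two printed numbers should agree up to rounding error, which is what (12.16) leads us to expect: a small relaxation of the constraint changes the optimal value at the rate given by the multiplier.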

A word of caution, however, is perhaps in order here. For this interpretation of $\lambda^*$, you
must express Z specifically as in (12.7). In particular, write the last term as $\lambda[c - g(x, y)]$,
not $\lambda[g(x, y) - c]$.


n-Variable and Multiconstraint Cases 

The generalization of the Lagrange-multiplier method to n variables can be easily carried
out if we write the choice variables in subscript notation. The objective function will then
be in the form

\[
z = f(x_1, x_2, \ldots, x_n)
\]

subject to the constraint

\[
g(x_1, x_2, \ldots, x_n) = c
\]

It follows that the Lagrangian function will be

\[
Z = f(x_1, x_2, \ldots, x_n) + \lambda[c - g(x_1, x_2, \ldots, x_n)]
\]

for which the first-order condition will consist of the following (n + 1) simultaneous
equations:

\[
\begin{aligned}
Z_\lambda &= c - g(x_1, x_2, \ldots, x_n) = 0 \\
Z_1 &= f_1 - \lambda g_1 = 0 \\
Z_2 &= f_2 - \lambda g_2 = 0 \\
&\;\;\vdots \\
Z_n &= f_n - \lambda g_n = 0
\end{aligned}
\]

Again, the first of these equations will assure us that the constraint is met, even though we
are to focus our attention on the free Lagrangian function.

When there is more than one constraint, the Lagrange-multiplier method is equally applicable,
provided we introduce as many such multipliers as there are constraints in the Lagrangian
function. Let an n-variable function be subject simultaneously to the two constraints

\[
g(x_1, x_2, \ldots, x_n) = c \qquad \text{and} \qquad h(x_1, x_2, \ldots, x_n) = d
\]

Then, adopting $\lambda$ and $\mu$ (the Greek letter mu) as the two undetermined multipliers, we may
construct a Lagrangian function as follows:

\[
Z = f(x_1, x_2, \ldots, x_n) + \lambda[c - g(x_1, x_2, \ldots, x_n)] + \mu[d - h(x_1, x_2, \ldots, x_n)]
\]

This function will have the same value as the original objective function f if both constraints
are satisfied, i.e., if the last two terms in the Lagrangian function both vanish.
Considering $\lambda$ and $\mu$ as choice variables, we now count (n + 2) variables; thus the first-order
condition will in this case consist of the following (n + 2) simultaneous equations:

\[
\begin{aligned}
Z_\lambda &= c - g(x_1, x_2, \ldots, x_n) = 0 \\
Z_\mu &= d - h(x_1, x_2, \ldots, x_n) = 0 \\
Z_i &= f_i - \lambda g_i - \mu h_i = 0 \qquad (i = 1, 2, \ldots, n)
\end{aligned}
\]

These should normally enable us to solve for all the $x_i$ as well as $\lambda$ and $\mu$. As before, the
first two equations of the necessary condition represent essentially a mere restatement of
the two constraints.
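
To see the (n + 2)-equation first-order condition at work, here is a minimal Python sketch under assumptions of our own choosing: the objective $f = x_1^2 + x_2^2 + x_3^2$ with the two linear constraints $x_1 + x_2 + x_3 = 6$ and $x_1 - x_2 = 1$. Because this particular f is quadratic and the constraints are linear, the first-order condition is a linear system, which the hypothetical helper `solve_linear` solves exactly with rational arithmetic.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination using exact fractions."""
    n = len(a)
    # Augmented matrix [a | b].
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring a row with a nonzero entry in this column up to the pivot position.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        piv = m[col][col]
        m[col] = [v / piv for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


# Unknowns ordered as (lam, mu, x1, x2, x3).
# Z_lam = 6 - x1 - x2 - x3 = 0      ->  x1 + x2 + x3 = 6
# Z_mu  = 1 - (x1 - x2)    = 0      ->  x1 - x2      = 1
# Z_i   = 2*x_i - lam*g_i - mu*h_i = 0, with g_i = (1, 1, 1), h_i = (1, -1, 0)
coefficients = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, -1, 0],
    [-1, -1, 2, 0, 0],
    [-1, 1, 0, 2, 0],
    [-1, 0, 0, 0, 2],
]
constants = [6, 1, 0, 0, 0]

lam, mu, x1, x2, x3 = solve_linear(coefficients, constants)
print(lam, mu, x1, x2, x3)
```

For a nonlinear objective or nonlinear constraints the first-order condition would no longer be a linear system, and a different solution method would be needed; the sketch only illustrates how the equations are assembled.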


EXERCISE 12.2 

1. Use the Lagrange-multiplier method to find the stationary values of z:

(a) $z = xy$, subject to $x + 2y = 2$.

(b) $z = x(y + 4)$, subject to $x + y = 8$.

(c) $z = x - 3y - xy$, subject to $x + y = 6$.

(d) $z = 7 - y + x^2$, subject to $x + y = 0$.

2. In Prob. 1, find whether a slight relaxation of the constraint will increase or decrease
the optimal value of z. At what rate?

3. Write the Lagrangian function and the first-order condition for stationary values (without
solving the equations) for each of the following:

(a) $z = x - 2y + 3w + xy - yw$, subject to $x + y + 2w = 10$.

(b) $z = x^2 + 2xy + yw^2$, subject to $2x + y + w^2 = 24$ and $x + w = 8$.

4. If, instead of $g(x, y) = c$, the constraint is written in the form of $C(x, y) = 0$, how
should the Lagrangian function and the first-order condition be modified as a
consequence?

5. In discussing the total-differential approach, it was pointed out that, given the constraint
$g(x, y) = c$, we may deduce that $dg = 0$. By the same token, we can further
deduce that $d^2g = d(dg) = d(0) = 0$. Yet, in our earlier discussion of the unconstrained
extremum of a function $z = f(x, y)$, we had a situation where $dz = 0$ is accompanied
by either a positive definite or a negative definite $d^2z$, rather than $d^2z = 0$. How would
you account for this disparity of treatment in the two cases?

6. If the Lagrangian function is written as $Z = f(x, y) + \lambda[g(x, y) - c]$ rather than as in
(12.7), can we still interpret the Lagrange multiplier as in (12.16)? Give the new interpretation,
if any.






12.3 Second-Order Conditions 


The introduction of a Lagrange multiplier as an additional variable makes it possible to
apply to the constrained-extremum problem the same first-order condition used in the
free-extremum problem. It is tempting to go a step further and borrow the second-order
necessary-and-sufficient conditions as well. This, however, should not be done. For even
though $Z^*$ is indeed a standard type of extremum with respect to the choice variables, it is
not so with respect to the Lagrange multiplier. Specifically, we can see from (12.15) that,
unlike $x^*$ and $y^*$, if $\lambda^*$ is replaced by any other value of $\lambda$, no effect will be produced on
$Z^*$, since $[c - g(x^*, y^*)]$ is identically zero. Thus the role played by $\lambda$ in the optimal solution
differs basically from that of x and y.† While it is harmless to treat $\lambda$ as just another choice
variable in the discussion of the first-order condition, we must be careful not to apply
blindly the second-order conditions developed for the free-extremum problem to the present
constrained case. Rather, we must derive a set of new ones. As we shall see, the new
conditions can again be stated in terms of the second-order total differential $d^2z$. However,
the presence of the constraint will entail certain significant modifications of the criterion.


Second-Order Total Differential 

It has been mentioned that, inasmuch as the constraint $g(x, y) = c$ means $dg = g_x\,dx +
g_y\,dy = 0$, as in (12.10), dx and dy no longer are both arbitrary. We may, of course, still take
(say) dx as an arbitrary change, but then dy must be regarded as dependent on dx, always to
be chosen so as to satisfy (12.10), i.e., to satisfy $dy = -(g_x/g_y)\,dx$. Viewed differently,
once the value of dx is specified, dy will depend on $g_x$ and $g_y$, but since the latter derivatives
in turn depend on the variables x and y, dy will also depend on x and y. Obviously,
then, the earlier formula for $d^2z$ in (11.6), which is based on the arbitrariness of both dx and
dy, can no longer apply.

To find an appropriate new expression for $d^2z$, we must treat dy as a variable dependent
on x and y during differentiation (if dx is to be considered a constant). Thus,

\[
\begin{aligned}
d^2z = d(dz) &= \frac{\partial(dz)}{\partial x}\,dx + \frac{\partial(dz)}{\partial y}\,dy \\
&= \frac{\partial}{\partial x}(f_x\,dx + f_y\,dy)\,dx + \frac{\partial}{\partial y}(f_x\,dx + f_y\,dy)\,dy \\
&= \left[ f_{xx}\,dx + \left( f_{xy}\,dy + f_y \frac{\partial(dy)}{\partial x} \right) \right] dx
  + \left[ f_{yx}\,dx + \left( f_{yy}\,dy + f_y \frac{\partial(dy)}{\partial y} \right) \right] dy \\
&= f_{xx}\,dx^2 + f_{xy}\,dy\,dx + f_y \frac{\partial(dy)}{\partial x}\,dx
  + f_{yx}\,dx\,dy + f_{yy}\,dy^2 + f_y \frac{\partial(dy)}{\partial y}\,dy
\end{aligned}
\]

† In a more general framework of constrained optimization known as "nonlinear programming,"
to be discussed in Chap. 13, it will be shown that, with inequality constraints, if $Z^*$ is a maximum
(minimum) with respect to x and y, then it will in fact be a minimum (maximum) with respect to $\lambda$.
In other words, the point $(\lambda^*, x^*, y^*)$ is a saddle point. The present case, where $Z^*$ is a genuine
extremum with respect to x and y but is invariant with respect to $\lambda$, may be considered as a
degenerate case of the saddle point. The saddle-point nature of the solution $(\lambda^*, x^*, y^*)$ also leads to
the important concept of "duality." But this subject is best to be pursued later.

Since the third and the sixth terms can be reduced to

\[
f_y \left[ \frac{\partial(dy)}{\partial x}\,dx + \frac{\partial(dy)}{\partial y}\,dy \right] = f_y\,d(dy) = f_y\,d^2y
\]

the desired expression for $d^2z$ is

\[
d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2 + f_y\,d^2y \tag{12.17}
\]

which differs from (11.6) only by the last term, $f_y\,d^2y$.

It should be noted that this last term is in the first degree [$d^2y$ is not the same as $(dy)^2$];
thus its presence in (12.17) disqualifies $d^2z$ as a quadratic form. However, $d^2z$ can be
transformed into a quadratic form by virtue of the constraint $g(x, y) = c$. Since the constraint
implies $dg = 0$ and also $d^2g = d(dg) = 0$, then by the procedure used in obtaining (12.17)
we can get

\[
(d^2g =)\; g_{xx}\,dx^2 + 2g_{xy}\,dx\,dy + g_{yy}\,dy^2 + g_y\,d^2y = 0
\]

Solving this last equation for $d^2y$ and substituting the result in (12.17), we are able to
eliminate the first-degree expression $d^2y$ and write $d^2z$ as the following quadratic form:

\[
d^2z = \left( f_{xx} - \frac{f_y}{g_y} g_{xx} \right) dx^2
     + 2 \left( f_{xy} - \frac{f_y}{g_y} g_{xy} \right) dx\,dy
     + \left( f_{yy} - \frac{f_y}{g_y} g_{yy} \right) dy^2
\]

Because of (12.11'), the first parenthetical coefficient is reducible to $(f_{xx} - \lambda g_{xx})$, and
similarly for the other terms. However, by partially differentiating the derivatives in (12.8),
you will find that the following second derivatives

\[
\begin{aligned}
Z_{xx} &= f_{xx} - \lambda g_{xx} \\
Z_{xy} &= f_{xy} - \lambda g_{xy} = Z_{yx} \\
Z_{yy} &= f_{yy} - \lambda g_{yy}
\end{aligned}
\tag{12.18}
\]

are precisely equal to these parenthetical coefficients. Hence, by making use of the
Lagrangian function, we can finally express $d^2z$ more neatly as follows:

\[
d^2z = Z_{xx}\,dx^2 + Z_{xy}\,dx\,dy + Z_{yx}\,dy\,dx + Z_{yy}\,dy^2 \tag{12.17'}
\]

The coefficients of (12.17') are simply the second partial derivatives of Z with respect
to the choice variables x and y; together, therefore, they can give rise to a Hessian
determinant.
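
For readers who wish to see the elimination of $d^2y$ written out, the step can be sketched as follows. The only assumptions are $g_y \neq 0$ and the first-order condition $f_y - \lambda g_y = 0$, which gives $\lambda = f_y / g_y$ at the stationary point.

```latex
% Solve d^2 g = 0 for d^2 y (assuming g_y \neq 0):
d^2y = -\frac{1}{g_y}\left( g_{xx}\,dx^2 + 2g_{xy}\,dx\,dy + g_{yy}\,dy^2 \right)

% Substitute into d^2 z = f_{xx} dx^2 + 2 f_{xy} dx dy + f_{yy} dy^2 + f_y d^2 y:
d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2
       - \frac{f_y}{g_y}\left( g_{xx}\,dx^2 + 2g_{xy}\,dx\,dy + g_{yy}\,dy^2 \right)

% Collect terms and use \lambda = f_y / g_y from the first-order condition:
d^2z = (f_{xx} - \lambda g_{xx})\,dx^2 + 2(f_{xy} - \lambda g_{xy})\,dx\,dy
       + (f_{yy} - \lambda g_{yy})\,dy^2
```

Each coefficient in the last line is a second partial derivative of the Lagrangian function with respect to x and y, which is why the quadratic form can be written with the Z symbols.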


Second-Order Conditions 

For a constrained extremum of $z = f(x, y)$, subject to $g(x, y) = c$, the second-order
necessary-and-sufficient conditions still revolve around the algebraic sign of the second-order
total differential $d^2z$, evaluated at a stationary point. However, there is one important
change. In the present context, we are concerned with the sign definiteness or
semidefiniteness of $d^2z$, not for all possible values of dx and dy (not both zero), but only for those
dx and dy values (not both zero) satisfying the linear constraint (12.10), $g_x\,dx + g_y\,dy = 0$.
Thus the second-order necessary conditions are

For maximum of z: $d^2z$ negative semidefinite, subject to $dg = 0$

For minimum of z: $d^2z$ positive semidefinite, subject to $dg = 0$

and the second-order sufficient conditions are

For maximum of z: $d^2z$ negative definite, subject to $dg = 0$

For minimum of z: $d^2z$ positive definite, subject to $dg = 0$

In the following, we shall concentrate on the second-order sufficient conditions.

Inasmuch as the (dx, dy) pairs satisfying the constraint $g_x\,dx + g_y\,dy = 0$ constitute
merely a subset of the set of all possible dx and dy, the constrained sign definiteness is less
stringent (that is, easier to satisfy) than the unconstrained sign definiteness discussed in
Chap. 11. In other words, the second-order sufficient condition for a constrained-extremum
problem is a weaker condition than that for a free-extremum problem. This is welcome
news because, unlike necessary conditions which must be stringent in order to serve as
effective screening devices, sufficient conditions should be weak to be truly serviceable.†


The Bordered Hessian 

As in the case of free extremum, it is possible to express the second-order sufficient condition
in determinantal form. In place of the Hessian determinant $|H|$, however, in the
constrained-extremum case we shall encounter what is known as a bordered Hessian.

In preparation for the development of this idea, let us first analyze the conditions for the
sign definiteness of a two-variable quadratic form, subject to a linear constraint, say,

\[
q = au^2 + 2huv + bv^2 \qquad \text{subject to} \qquad \alpha u + \beta v = 0
\]

Since the constraint implies $v = -(\alpha/\beta)u$, we can rewrite q as a function of one variable only:

\[
q = au^2 - 2h\frac{\alpha}{\beta}u^2 + b\frac{\alpha^2}{\beta^2}u^2
  = (a\beta^2 - 2h\alpha\beta + b\alpha^2)\frac{u^2}{\beta^2}
\]

It is obvious that q is positive (negative) definite if and only if the expression in parentheses
is positive (negative). Now, it so happens that the following symmetric determinant

\[
\begin{vmatrix}
0 & \alpha & \beta \\
\alpha & a & h \\
\beta & h & b
\end{vmatrix}
= 2h\alpha\beta - a\beta^2 - b\alpha^2
\]

is exactly the negative of the said parenthetical expression. Consequently, we can state that

\[
q \text{ is }
\begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix}
\text{ subject to } \alpha u + \beta v = 0
\quad \text{iff} \quad
\begin{vmatrix}
0 & \alpha & \beta \\
\alpha & a & h \\
\beta & h & b
\end{vmatrix}
\begin{Bmatrix} < 0 \\ > 0 \end{Bmatrix}
\]

† "A million-dollar bank deposit" is clearly a sufficient condition for "being able to afford a steak
dinner." But the extremely limited applicability of that condition renders it practically useless. A more
meaningful sufficient condition might be something like "fifty dollars in one's wallet," which is a
much less stringent financial requirement.

It is noteworthy that the determinant used in this criterion is nothing but the discriminant
of the original quadratic form, $\begin{vmatrix} a & h \\ h & b \end{vmatrix}$, with a border placed on top and a similar border on
the left. Furthermore, the border is merely composed of the two coefficients $\alpha$ and $\beta$ from the
constraint, plus a zero in the principal diagonal. This bordered discriminant is symmetric.

Example 1

Determine whether $q = 4u^2 + 4uv + 3v^2$, subject to $u - 2v = 0$, is either positive or
negative definite. We first form the bordered discriminant

\[
\begin{vmatrix}
0 & 1 & -2 \\
1 & 4 & 2 \\
-2 & 2 & 3
\end{vmatrix}
\]

which is made symmetric by splitting the coefficient of uv into two equal parts for insertion
into the determinant. Inasmuch as the determinant has a negative value (-27), q must be positive
definite.
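
The arithmetic of this example can be checked with a short Python sketch. The helper names `det` and `bordered_discriminant` are our own; the sketch evaluates the bordered discriminant and, as a cross-check, evaluates q directly at a few points that satisfy the constraint u = 2v.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def bordered_discriminant(a, h, b, alpha, beta):
    """Discriminant of q = a*u^2 + 2*h*u*v + b*v^2 bordered by alpha*u + beta*v = 0."""
    return det([[0, alpha, beta],
                [alpha, a, h],
                [beta, h, b]])


# q = 4u^2 + 4uv + 3v^2 subject to u - 2v = 0 (so a = 4, h = 2, b = 3).
d = bordered_discriminant(a=4, h=2, b=3, alpha=1, beta=-2)


def q(u, v):
    return 4 * u * u + 4 * u * v + 3 * v * v


# Points on the constraint line have the form (u, v) = (2v, v) with v nonzero.
values_on_constraint = [q(2 * v, v) for v in (-3, -1, 1, 2)]
print(d, values_on_constraint)
```

A negative bordered discriminant together with strictly positive values of q along the constraint line is consistent with the conclusion that q is positive definite subject to the constraint.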


When applied to the quadratic form $d^2z$ in (12.17'), the variables u and v become dx
and dy, respectively, and the (plain) discriminant consists of the Hessian
$\begin{vmatrix} Z_{xx} & Z_{xy} \\ Z_{yx} & Z_{yy} \end{vmatrix}$.
Moreover, the constraint to the quadratic form being $g_x\,dx + g_y\,dy = 0$, we have $\alpha = g_x$ and
$\beta = g_y$. Thus, for values of dx and dy that satisfy the said constraint, we now have the
following determinantal criterion for the sign definiteness of $d^2z$:

\[
d^2z \text{ is }
\begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix}
\text{ subject to } dg = 0
\quad \text{iff} \quad
\begin{vmatrix}
0 & g_x & g_y \\
g_x & Z_{xx} & Z_{xy} \\
g_y & Z_{yx} & Z_{yy}
\end{vmatrix}
\begin{Bmatrix} < 0 \\ > 0 \end{Bmatrix}
\]

The determinant to the right, often referred to as a bordered Hessian, shall be denoted by
$|\bar{H}|$, where the bar on top symbolizes the border. On the basis of this, we may conclude that,
given a stationary value of $z = f(x, y)$ or of $Z = f(x, y) + \lambda[c - g(x, y)]$, a positive $|\bar{H}|$
is sufficient to establish it as a relative maximum of z; similarly, a negative $|\bar{H}|$ is sufficient
to establish it as a minimum, all the derivatives involved in $|\bar{H}|$ being evaluated at the
critical values of x and y.

Now that we have derived the second-order sufficient condition, it is an easy matter
to verify that, as earlier claimed, the satisfaction of this condition will guarantee that the
endogenous-variable Jacobian (12.12) does not vanish in the optimal state. Substituting
(12.18) into (12.12), and multiplying both the first column and the first row of the Jacobian
by -1 (which will leave the value of the determinant unaltered), we see that

\[
|J| =
\begin{vmatrix}
0 & g_x & g_y \\
g_x & Z_{xx} & Z_{xy} \\
g_y & Z_{yx} & Z_{yy}
\end{vmatrix}
= |\bar{H}|
\tag{12.19}
\]


That is, the endogenous-variable Jacobian is identical with the bordered Hessian, a result
similar to (11.42) where it was shown that, in the free-extremum context, the
endogenous-variable Jacobian is identical with the plain Hessian. If, in fulfillment of the sufficient
condition, we have $|\bar{H}| \neq 0$ at the optimum, then $|J|$ must also be nonzero. Consequently,
in applying the implicit-function theorem to the present context, it would not be amiss to
substitute the condition $|\bar{H}| \neq 0$ for the usual condition $|J| \neq 0$. This practice will be
followed when we analyze the comparative statics of constrained-optimization problems in
Sec. 12.5.

Example 2

Let us now return to Example 1 of Sec. 12.2 and ascertain whether the stationary value
found there gives a maximum or a minimum. Since $Z_x = y - \lambda$ and $Z_y = x - \lambda$, the second-order
partial derivatives are $Z_{xx} = 0$, $Z_{xy} = Z_{yx} = 1$, and $Z_{yy} = 0$. The border elements we
need are $g_x = 1$ and $g_y = 1$. Thus we find that

\[
|\bar{H}| =
\begin{vmatrix}
0 & 1 & 1 \\
1 & 0 & 1 \\
1 & 1 & 0
\end{vmatrix}
= 2 > 0
\]

which establishes the value $z^* = 9$ as a maximum.

Example 3

Continuing on to Example 2 of Sec. 12.2, we see that $Z_1 = 2x_1 - \lambda$ and $Z_2 = 2x_2 - 4\lambda$.
These yield $Z_{11} = 2$, $Z_{12} = Z_{21} = 0$, and $Z_{22} = 2$. From the constraint $x_1 + 4x_2 = 2$, we
obtain $g_1 = 1$ and $g_2 = 4$. It follows that the bordered Hessian is

\[
|\bar{H}| =
\begin{vmatrix}
0 & 1 & 4 \\
1 & 2 & 0 \\
4 & 0 & 2
\end{vmatrix}
= -34 < 0
\]

and the value $z^* = \frac{4}{17}$ is a minimum.

Example 4

Consider a simple two-period model where a consumer's utility is a function of consumption
in both periods. Let the consumer's utility function be

\[
U(x_1, x_2) = x_1 x_2
\]

where $x_1$ is consumption in period 1 and $x_2$ is consumption in period 2. The consumer is
also endowed with a budget B at the beginning of period 1.

Let r denote a market interest rate at which the consumer can choose to borrow or lend
across the two periods. The consumer's intertemporal budget constraint is that $x_1$ and the
present value of $x_2$ add up to B. Thus,

\[
x_1 + \frac{x_2}{1 + r} = B
\]

The Lagrangian for this utility maximization problem is

\[
Z = x_1 x_2 + \lambda \left( B - x_1 - \frac{x_2}{1 + r} \right)
\]

with first-order conditions

\[
\begin{aligned}
\frac{\partial Z}{\partial \lambda} &= B - x_1 - \frac{x_2}{1 + r} = 0 \\
\frac{\partial Z}{\partial x_1} &= x_2 - \lambda = 0 \\
\frac{\partial Z}{\partial x_2} &= x_1 - \frac{\lambda}{1 + r} = 0
\end{aligned}
\]

Combining the last two first-order equations to eliminate $\lambda$ gives us

\[
\frac{x_2}{x_1} = \frac{\lambda}{\lambda/(1 + r)} = 1 + r
\]

Substituting this equation into the budget constraint then yields the solution

\[
x_1^* = \frac{B}{2} \qquad \text{and} \qquad x_2^* = \frac{B(1 + r)}{2}
\]

Next, we should check the second-order sufficient condition for a maximum. The bordered
Hessian for this problem is

\[
|\bar{H}| =
\begin{vmatrix}
0 & -1 & -\dfrac{1}{1 + r} \\[1.5ex]
-1 & 0 & 1 \\[1ex]
-\dfrac{1}{1 + r} & 1 & 0
\end{vmatrix}
= \frac{2}{1 + r} > 0
\]

Thus the second-order sufficient condition is satisfied for a maximum U.
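
The three bordered Hessians in Examples 2 to 4 can be verified with a brief Python sketch. The helper `det` is our own, and for Example 4 we assume the purely illustrative values r = 5 percent and B = 100, since the text leaves both unspecified.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


# Example 2: border (1, 1), Hessian Z_xx = 0, Z_xy = Z_yx = 1, Z_yy = 0.
h_ex2 = det([[0, 1, 1], [1, 0, 1], [1, 1, 0]])

# Example 3: border (1, 4), Hessian Z_11 = 2, Z_12 = Z_21 = 0, Z_22 = 2.
h_ex3 = det([[0, 1, 4], [1, 2, 0], [4, 0, 2]])

# Example 4 with assumed values for the interest rate and the budget.
r = Fraction(1, 20)          # assumed: r = 5 percent
B = Fraction(100)            # assumed: budget of 100
k = 1 / (1 + r)              # the coefficient 1/(1 + r) on x2 in the constraint
h_ex4 = det([[0, -1, -k], [-1, 0, 1], [-k, 1, 0]])

# The stated solution should satisfy all three first-order conditions.
x1 = B / 2
x2 = B * (1 + r) / 2
lam = x2                     # from x2 - lam = 0
foc = (B - x1 - x2 * k, x2 - lam, x1 - lam * k)

print(h_ex2, h_ex3, h_ex4, foc)
```

Exact fractions are used so that the comparison of the Example 4 determinant with 2/(1 + r) is free of rounding error.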


n-Variable Case 

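When the objective function takes the form

\[
z = f(x_1, x_2, \ldots, x_n) \qquad \text{subject to} \qquad g(x_1, x_2, \ldots, x_n) = c
\]

the second-order condition still hinges on the sign of $d^2z$. Since the latter is a constrained
quadratic form in the variables $dx_1, dx_2, \ldots, dx_n$, subject to the relation

\[
(dg =)\; g_1\,dx_1 + g_2\,dx_2 + \cdots + g_n\,dx_n = 0
\]

the conditions for the positive or negative definiteness of $d^2z$ again involve a bordered
Hessian. But this time these conditions must be expressed in terms of the bordered leading
principal minors of the Hessian.

Given a bordered Hessian

\[
|\bar{H}| =
\begin{vmatrix}
0 & g_1 & g_2 & \cdots & g_n \\
g_1 & Z_{11} & Z_{12} & \cdots & Z_{1n} \\
g_2 & Z_{21} & Z_{22} & \cdots & Z_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
g_n & Z_{n1} & Z_{n2} & \cdots & Z_{nn}
\end{vmatrix}
\]

its bordered leading principal minors can be defined as

\[
|\bar{H}_2| \equiv
\begin{vmatrix}
0 & g_1 & g_2 \\
g_1 & Z_{11} & Z_{12} \\
g_2 & Z_{21} & Z_{22}
\end{vmatrix}
\qquad
|\bar{H}_3| \equiv
\begin{vmatrix}
0 & g_1 & g_2 & g_3 \\
g_1 & Z_{11} & Z_{12} & Z_{13} \\
g_2 & Z_{21} & Z_{22} & Z_{23} \\
g_3 & Z_{31} & Z_{32} & Z_{33}
\end{vmatrix}
\qquad \text{(etc.)}
\]
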

with the last one being $|\bar{H}_n| = |\bar{H}|$. In the newly introduced symbols, the horizontal bar
above H again means bordered, and the subscript indicates the order of the leading principal
minor being bordered. For instance, $|\bar{H}_2|$ involves the second leading principal minor of
the (plain) Hessian, bordered with 0, $g_1$, and $g_2$; and similarly for the others. The conditions
for positive and negative definiteness of $d^2z$ are then

\[
d^2z \text{ is }
\begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix}
\text{ subject to } dg = 0
\quad \text{iff} \quad
\begin{Bmatrix}
|\bar{H}_2|, |\bar{H}_3|, \ldots, |\bar{H}_n| < 0 \\
|\bar{H}_2| > 0;\; |\bar{H}_3| < 0;\; |\bar{H}_4| > 0;\; \text{etc.}
\end{Bmatrix}
\]

In the former, all the bordered leading principal minors, starting with $|\bar{H}_2|$, must be negative;
in the latter, they must alternate in sign. As previously, a positive definite $d^2z$ is
sufficient to establish a stationary value of z as its minimum, whereas a negative definite
$d^2z$ is sufficient to establish it as a maximum.

TABLE 12.1  Determinantal Test for Relative Constrained Extremum: $z = f(x_1, x_2, \ldots, x_n)$,
subject to $g(x_1, x_2, \ldots, x_n) = c$; with $Z = f(x_1, x_2, \ldots, x_n) + \lambda[c - g(x_1, x_2, \ldots, x_n)]$

Condition                             Maximum                                                   Minimum

First-order necessary condition       $Z_\lambda = Z_1 = Z_2 = \cdots = Z_n = 0$                $Z_\lambda = Z_1 = Z_2 = \cdots = Z_n = 0$

Second-order sufficient condition†    $|\bar{H}_2| > 0$; $|\bar{H}_3| < 0$;                     $|\bar{H}_2|, |\bar{H}_3|, \ldots, |\bar{H}_n| < 0$
                                      $|\bar{H}_4| > 0$; ...; $(-1)^n |\bar{H}_n| > 0$

† Applicable only after the first-order necessary condition has been satisfied.

Drawing the threads of the discussion together, we may summarize the conditions for a
constrained relative extremum in Table 12.1. You will recognize, however, that the criterion
stated in the table is not complete. Because the second-order sufficient condition is not
necessary, failure to satisfy the criteria stated does not preclude the possibility that the
stationary value is nonetheless a maximum or a minimum as the case may be. In many economic
applications, however, this (relatively less stringent) second-order sufficient condition is
either satisfied, or assumed to be satisfied, so that the information in the table is adequate.
It should prove instructive for you to compare the results contained in Table 12.1 with those
in Table 11.2 for the free-extremum case.
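
The determinantal test for the one-constraint, n-variable case lends itself to a small Python sketch. The function names `bordered_minors` and `classify` are our own, and the two test problems (the sum of squares of three variables, and its negative, each subject to $x_1 + x_2 + x_3 = c$) are assumed purely for illustration. The inputs are the constraint derivatives $g_i$ and the second derivatives $Z_{ij}$, both taken as already evaluated at a point satisfying the first-order condition.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def bordered_minors(g, Z):
    """Return [|H2|, |H3|, ..., |Hn|] for constraint derivatives g and Hessian Z."""
    n = len(g)
    minors = []
    for k in range(2, n + 1):
        top = [0] + list(g[:k])                               # border row
        rows = [[g[i]] + list(Z[i][:k]) for i in range(k)]    # border column + Z block
        minors.append(det([top] + rows))
    return minors


def classify(minors):
    """Apply the sign patterns of the second-order sufficient condition."""
    if all(d < 0 for d in minors):
        return "minimum"
    # Alternating signs, starting with a positive |H2|.
    if all(d > 0 if i % 2 == 0 else d < 0 for i, d in enumerate(minors)):
        return "maximum"
    return "inconclusive"


g = [1, 1, 1]
Z_sum_of_squares = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
Z_negated = [[-2, 0, 0], [0, -2, 0], [0, 0, -2]]

minors_a = bordered_minors(g, Z_sum_of_squares)
minors_b = bordered_minors(g, Z_negated)
print(minors_a, classify(minors_a))
print(minors_b, classify(minors_b))
```

An "inconclusive" result reflects the fact that the condition is sufficient but not necessary: the stationary value may still be an extremum even when the sign pattern fails.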

Multiconstraint Case 

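When more than one constraint appears in the problem, the second-order condition involves
a Hessian with more than one border. Suppose that there are n choice variables
and m constraints (m < n) of the form $g^j(x_1, \ldots, x_n) = c_j$. Then the Lagrangian function
will be

\[
Z = f(x_1, \ldots, x_n) + \sum_{j=1}^{m} \lambda_j [c_j - g^j(x_1, \ldots, x_n)]
\]

and the bordered Hessian will appear as

\[
|\bar{H}| \equiv
\left|
\begin{array}{cccc|cccc}
0 & 0 & \cdots & 0 & g_1^1 & g_2^1 & \cdots & g_n^1 \\
0 & 0 & \cdots & 0 & g_1^2 & g_2^2 & \cdots & g_n^2 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & g_1^m & g_2^m & \cdots & g_n^m \\
\hline
g_1^1 & g_1^2 & \cdots & g_1^m & Z_{11} & Z_{12} & \cdots & Z_{1n} \\
g_2^1 & g_2^2 & \cdots & g_2^m & Z_{21} & Z_{22} & \cdots & Z_{2n} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
g_n^1 & g_n^2 & \cdots & g_n^m & Z_{n1} & Z_{n2} & \cdots & Z_{nn}
\end{array}
\right|
\]
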

where $g_i^j \equiv \partial g^j / \partial x_i$ are the partial derivatives of the constraint functions, and the
double-subscripted Z symbols denote, as before, the second-order partial derivatives of the
Lagrangian function. Note that we have partitioned the bordered Hessian into four areas
for visual clarity. The upper-left area consists of zeros only, and the lower-right area is simply
the plain Hessian. The other two areas, containing the $g_i^j$ derivatives, bear a mirror-image
relationship to each other with reference to the principal diagonal, thereby resulting
in a symmetric array of elements in the entire bordered Hessian.

Various bordered leading principal minors can be formed from $|\bar{H}|$. The one that contains
$Z_{22}$ as the last element of its principal diagonal may be denoted by $|\bar{H}_2|$, as before. By
including one more row and one more column, so that $Z_{33}$ enters into the scene, we will
have $|\bar{H}_3|$, and so forth. With this symbology, we can state the second-order sufficient
condition in terms of the signs of the following (n - m) bordered leading principal minors:

\[
|\bar{H}_{m+1}|, \; |\bar{H}_{m+2}|, \; \ldots, \; |\bar{H}_n| \; (= |\bar{H}|)
\]

For a maximum of z, a sufficient condition is that these bordered leading principal minors
alternate in sign, the sign of $|\bar{H}_{m+1}|$ being that of $(-1)^{m+1}$. For a minimum of z, a sufficient
condition is that these principal minors all take the same sign, namely, that of $(-1)^m$.

Note that it makes an important difference whether we have an odd or even number of
constraints, because (-1) raised to an odd power will yield the opposite sign to the case of
an even power. Note, also, that when m = 1, the condition just stated reduces to that
presented in Table 12.1.
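
The multiconstraint sign rule can be sketched in Python as follows. The names `bordered_minor` and `second_order_test` are hypothetical, and the test problem (the sum of squares of three variables subject to the two linear constraints $x_1 + x_2 + x_3 = c_1$ and $x_1 - x_2 = c_2$) is assumed only for illustration. With m = 2 and n = 3 there is just one minor to inspect, $|\bar{H}_3|$.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def bordered_minor(G, Z, k):
    """|H_k|: the k-th leading principal minor of Z bordered by all m constraints."""
    m = len(G)
    top = [[0] * m + list(G[j][:k]) for j in range(m)]
    bottom = [[G[j][i] for j in range(m)] + list(Z[i][:k]) for i in range(k)]
    return det(top + bottom)


def sign(value):
    return (value > 0) - (value < 0)


def second_order_test(G, Z):
    """Check the signs of |H_{m+1}|, ..., |H_n| against the sufficient conditions."""
    m, n = len(G), len(Z)
    minors = [bordered_minor(G, Z, k) for k in range(m + 1, n + 1)]
    if all(sign(d) == (-1) ** m for d in minors):
        return "minimum", minors
    if all(sign(d) == (-1) ** (m + 1 + i) for i, d in enumerate(minors)):
        return "maximum", minors
    return "inconclusive", minors


# Rows of G hold the derivatives of each constraint; Z is the Hessian of the Lagrangian.
G = [[1, 1, 1], [1, -1, 0]]
Z = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
result_two_constraints = second_order_test(G, Z)

# With m = 1 the rule should reduce to the single-constraint test.
result_one_constraint = second_order_test([[1, 1]], [[0, 1], [1, 0]])
print(result_two_constraints, result_one_constraint)
```

The second call reuses the border and Hessian of the two-variable example with a single constraint, which gives a quick consistency check between the general rule and the single-constraint rule.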


EXERCISE 12.3 

1. Use the bordered Hessian to determine whether the stationary value of z obtained in 
each part of Exercise 12.2-1 is a maximum or a minimum. 

2. In stating the second-order sufficient conditions for constrained maximum and minimum,
we specified the algebraic signs of $|\bar{H}_2|$, $|\bar{H}_3|$, $|\bar{H}_4|$, etc., but not of $|\bar{H}_1|$. Write out
an appropriate expression for $|\bar{H}_1|$, and verify that it invariably takes the negative sign.

3. Recalling Property II of determinants (Sec. 5.3), show that:

(a) By appropriately interchanging two rows and/or two columns of $|\bar{H}_2|$ and duly
altering the sign of the determinant after each interchange, it can be transformed
into

\[
\begin{vmatrix}
Z_{11} & Z_{12} & g_1 \\
Z_{21} & Z_{22} & g_2 \\
g_1 & g_2 & 0
\end{vmatrix}
\]

(b) By a similar procedure, $|\bar{H}_3|$ can be transformed into

\[
\begin{vmatrix}
Z_{11} & Z_{12} & Z_{13} & g_1 \\
Z_{21} & Z_{22} & Z_{23} & g_2 \\
Z_{31} & Z_{32} & Z_{33} & g_3 \\
g_1 & g_2 & g_3 & 0
\end{vmatrix}
\]

What alternative way of "bordering" the principal minors of the Hessian do these
results suggest?

4. Write out the bordered Hessian for a constrained optimization problem with four
choice variables and two constraints. Then state specifically the second-order sufficient
condition for a maximum and for a minimum of z, respectively.


12.4 Quasiconcavity and Quasiconvexity

In Sec. 11.5 it was shown that, for a problem of free extremum, a knowledge of the concavity
or convexity of the objective function obviates the need to check the second-order
condition. In the context of constrained optimization, it is again possible to dispense with
the second-order condition if the surface or hypersurface has the appropriate type of
configuration. But this time the desired configuration is quasiconcavity (rather than concavity)
for a maximum, and quasiconvexity (rather than convexity) for a minimum. As we shall
demonstrate, quasiconcavity (quasiconvexity) is a weaker condition than concavity
(convexity). This is only to be expected, since the second-order sufficient condition to be
dispensed with is also weaker for the constrained optimization problem ($d^2z$ definite in sign
only for those $dx_i$ satisfying $dg = 0$) than for the free one ($d^2z$ definite in sign for all $dx_i$).

Geometric Characterization 

Quasiconcavity and quasiconvexity, like concavity and convexity, can be either strict or
nonstrict. We shall first present the geometric characterization of these concepts:

Let u and v be any two distinct points in the domain (a convex set) of a function f, and let
line segment uv in the domain give rise to arc MN on the graph of the function, such that
point N is higher than or equal in height to point M. Then function f is said to be
quasiconcave (quasiconvex) if all points on arc MN other than M and N are higher than or equal in
height to point M (lower than or equal in height to point N). The function f is said to be
strictly quasiconcave (strictly quasiconvex) if all the points on arc MN other than M and N
are strictly higher than point M (strictly lower than point N).

It should be clear from this that any strictly quasiconcave (strictly quasiconvex) function is
quasiconcave (quasiconvex), but the converse is not true.

For a better grasp, let us examine the illustrations in Fig. 12.3, all drawn for the
one-variable case. In Fig. 12.3a, line segment uv in the domain gives rise to arc MN on the curve
such that N is higher than M. Since all the points between M and N on the said arc are
strictly higher than M, this particular arc satisfies the condition for strict quasiconcavity.
For the curve to qualify as strictly quasiconcave, however, all possible (u, v) pairs must have
arcs that satisfy the same condition. This is indeed the case for the function in Fig. 12.3a.
Note that this function also satisfies the condition for (nonstrict) quasiconcavity. But it fails
the condition for quasiconvexity, because some points on arc MN are higher than N, which
is forbidden for a quasiconvex function. The function in Fig. 12.3b has the opposite
configuration. All the points on arc M'N' are lower than N', the higher of the two ends, and
the same is true of all arcs that can be drawn. Thus the function in Fig. 12.3b is strictly
quasiconvex. As you can verify, it also satisfies the condition for (nonstrict) quasiconvexity,
but fails the condition for quasiconcavity. What distinguishes Fig. 12.3c is the presence
of a horizontal line segment M''N'', where all the points have the same height. As a result,
that line segment, and hence the entire curve, can only meet the condition for quasiconcavity,
but not strict quasiconcavity.

FIGURE 12.4

Generally speaking, a quasiconcave function that is not also concave has a graph roughly
shaped like a bell, or a portion thereof, and a quasiconvex function has a graph shaped like an
inverted bell, or a portion thereof. On the bell, it is admissible (though not required) to have
both concave and convex segments. This more permissive nature of the characterization
makes quasiconcavity (quasiconvexity) a weaker condition than concavity (convexity). In
Fig. 12.4, we contrast strict concavity against strict quasiconcavity for the two-variable case.
As drawn, both surfaces depict increasing functions, as they contain only the ascending
portions of a dome and a bell, respectively. The surface in Fig. 12.4a is strictly concave, but the
one in Fig. 12.4b is certainly not, since it contains convex portions near the base of the bell.
Yet it is strictly quasiconcave; all the arcs on the surface, exemplified by MN and M'N',
satisfy the condition that all the points on each arc between the two end points are higher than the
lower end point. Returning to Fig. 12.4a, we should note that the surface therein is also strictly
quasiconcave. Although we have not drawn any illustrative arcs MN and M'N' in Fig. 12.4a,
it is not difficult to check that all possible arcs do indeed satisfy the condition for strict
quasiconcavity. In general, a strictly concave function must be strictly quasiconcave, although the
converse is not true. We shall demonstrate this more formally in the paragraphs that follow.


Algebraic Definition 

The preceding geometric characterization can be translated into an algebraic definition for 
easier generalization to higher-dimensional cases: 


A function f is $\begin{Bmatrix} \text{quasiconcave} \\ \text{quasiconvex} \end{Bmatrix}$ iff, for any pair of distinct points u and v in the (convex-set)
domain of f, and for $0 < \theta < 1$,

\[
f(v) \geq f(u) \;\Rightarrow\; f[\theta u + (1 - \theta)v]
\begin{Bmatrix} \geq f(u) \\ \leq f(v) \end{Bmatrix}
\tag{12.20}
\]

To adapt this definition to strict quasiconcavity and quasiconvexity, the two weak inequalities
on the right should be changed into strict inequalities
$\begin{Bmatrix} > f(u) \\ < f(v) \end{Bmatrix}$. You may find it
instructive to compare (12.20) with (11.20).
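
Definition (12.20) can be tried out numerically on one-variable functions. The Python sketch below, with the hypothetical helper `satisfies_quasiconcavity`, tests the implication only on a finite sample of points and weights, so a result of True is merely evidence and not a proof, whereas a result of False exhibits a genuine violation. The bell-shaped function $e^{-x^2}$ and the parabola $x^2$ are assumed examples of our own.

```python
import math


def satisfies_quasiconcavity(f, points, thetas, tol=1e-12):
    """Sampled check of (12.20): f(v) >= f(u) implies f(theta*u + (1-theta)*v) >= f(u)."""
    for u in points:
        for v in points:
            if u == v or f(v) < f(u):
                continue                     # the premise f(v) >= f(u) does not hold
            for theta in thetas:
                w = theta * u + (1 - theta) * v
                if f(w) < f(u) - tol:
                    return False             # a point on the arc lies below the lower end
    return True


points = [i / 4 for i in range(-12, 13)]     # sample of the domain: -3, -2.75, ..., 3
thetas = [k / 10 for k in range(1, 10)]      # weights strictly between 0 and 1


def bell(x):
    return math.exp(-x * x)


def parabola(x):
    return x * x


def negative_parabola(x):
    return -parabola(x)


bell_is_quasiconcave = satisfies_quasiconcavity(bell, points, thetas)
parabola_is_quasiconcave = satisfies_quasiconcavity(parabola, points, thetas)
# A function is quasiconvex when its negative is quasiconcave, so testing the
# negative of the parabola is a sampled test of the quasiconvexity of the parabola.
parabola_is_quasiconvex = satisfies_quasiconcavity(negative_parabola, points, thetas)
print(bell_is_quasiconcave, parabola_is_quasiconcave, parabola_is_quasiconvex)
```

The bell-shaped function has both concave and convex segments and yet passes the sampled test, in line with the remark that quasiconcavity is a weaker requirement than concavity.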

From this definition, the following three theorems readily follow. These will be stated
in terms of a function f(x), where x can be interpreted as a vector of variables,
$x = (x_1, \ldots, x_n)$.

Theorem I (negative of a function) If f(x) is quasiconcave (strictly quasiconcave),
then -f(x) is quasiconvex (strictly quasiconvex).

Theorem II (concavity versus quasiconcavity) Any concave (convex) function is
quasiconcave (quasiconvex), but the converse is not true. Similarly, any strictly concave
(strictly convex) function is strictly quasiconcave (strictly quasiconvex), but the converse is
not true.

Theorem III (linear function) If f(x) is a linear function, then it is quasiconcave as
well as quasiconvex.

Theorem I follows from the fact that multiplying an inequality by -1 reverses the sense
of inequality. Let f(x) be quasiconcave, with $f(v) \geq f(u)$. Then, by (12.20),
$f[\theta u + (1 - \theta)v] \geq f(u)$. As far as the function -f(x) is concerned, however, we have (after
multiplying the two inequalities through by -1) $-f(u) \geq -f(v)$ and $-f[\theta u + (1 - \theta)v] \leq
-f(u)$. Interpreting $-f(u)$ as the height of point N, and $-f(v)$ as the height of M, we see
that the function -f(x) satisfies the condition for quasiconvexity in (12.20). This proves
one of the four cases cited in Theorem I; the proofs for the other three are similar.

For Theorem II, we shall only prove that concavity implies quasiconcavity. Let $f(x)$ be
concave. Then, by (11.20),

$$f[\theta u + (1-\theta)v] \ge \theta f(u) + (1-\theta) f(v)$$

Now assume that $f(v) \ge f(u)$; then any weighted average of $f(v)$ and $f(u)$ cannot
possibly be less than $f(u)$, i.e.,

$$\theta f(u) + (1-\theta) f(v) \ge f(u)$$

Combining these two results, we find that, by transitivity,

$$f[\theta u + (1-\theta)v] \ge f(u) \qquad \text{for } f(v) \ge f(u)$$

which satisfies the definition of quasiconcavity in (12.20). Note, however, that the
condition for quasiconcavity cannot guarantee concavity.

Once Theorem II is established, Theorem III follows immediately. We already know that
a linear function is both concave and convex, though not strictly so. In view of Theorem II, 
a linear function must also be both quasiconcave and quasiconvex, though not strictly so. 

In the case of concave and convex functions, there is a useful theorem to the effect 
that the sum of concave (convex) functions is also concave (convex). Unfortunately, 
this theorem cannot be generalized to quasiconcave and quasiconvex functions. For 
instance, a sum of two quasiconcave functions is not necessarily quasiconcave (see 
Exercise 12.4-3). 
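
A small numerical illustration of this point may be helpful. The sketch below is not a proof, and the two functions in it ($x^3$ and $-x$) are illustrative choices of ours rather than examples from the text; the helper name `is_quasiconcave_on_grid` is likewise hypothetical. Each function is monotonic and hence quasiconcave, yet their sum fails the test in (12.20) once negative values of $x$ are admitted.

```python
from itertools import product


def is_quasiconcave_on_grid(f, points, thetas, tol=1e-9):
    """Grid check of (12.20): f(theta*u + (1-theta)*v) >= min(f(u), f(v)).

    Passing is only evidence of quasiconcavity on the grid;
    failing exhibits a genuine counterexample.
    """
    for u, v in product(points, repeat=2):
        for theta in thetas:
            w = theta * u + (1 - theta) * v
            if f(w) < min(f(u), f(v)) - tol:
                return False
    return True


def f(x):
    return x ** 3          # increasing, hence quasiconcave


def g(x):
    return -x              # linear, hence quasiconcave


def h(x):
    return f(x) + g(x)     # the sum x**3 - x


points = [i / 10 for i in range(-20, 21)]   # grid on [-2, 2]
thetas = [j / 10 for j in range(1, 10)]     # 0 < theta < 1

print(is_quasiconcave_on_grid(f, points, thetas))  # True
print(is_quasiconcave_on_grid(g, points, thetas))  # True
print(is_quasiconcave_on_grid(h, points, thetas))  # False
```

The failure for the sum can be seen by hand: with $u = -1$ and $v = 1.2$, the midpoint $0.1$ gives a value of $-0.099$, which lies below both $h(u) = 0$ and $h(v) = 0.528$.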
FIGURE 12.5 (panels a, b, and c)


Sometimes it may prove easier to check quasiconcavity and quasiconvexity by the
following alternative definition:

A function $f(x)$, where $x$ is a vector of variables, is quasiconcave (quasiconvex) iff,
for any constant $k$, the set

$$S^{\ge} \equiv \{x \mid f(x) \ge k\} \qquad \bigl(S^{\le} \equiv \{x \mid f(x) \le k\}\bigr) \tag{12.21}$$

is a convex set.


The sets $S^{\ge}$ and $S^{\le}$, which are subsets of the domain, were introduced earlier
(Fig. 11.10) to show that a convex function (or even a concave function) can give rise to a
convex set. Here we are employing these two sets as tests for quasiconcavity and
quasiconvexity. The three functions in Fig. 12.5 all contain concave as well as convex
segments and hence are neither convex nor concave. But the function in Fig. 12.5a is
quasiconcave, because for any value of $k$ (only one of which has been illustrated), the set
$S^{\ge}$ is convex. The function in Fig. 12.5b is, on the other hand, quasiconvex since the
set $S^{\le}$ is convex. The function in Fig. 12.5c, a monotonic function, differs from the
other two in that both $S^{\ge}$ and $S^{\le}$ are convex sets. Hence that function is
quasiconcave as well as quasiconvex.

Note that while (12.21) can be used to check quasiconcavity and quasiconvexity, it is 
incapable of distinguishing between strict and nonstrict varieties of these properties. 
Note, also, that the defining properties in (12.21) are in themselves not sufficient for 
concavity and convexity, respectively. In particular, given a concave function which must
perforce be quasiconcave, we can conclude that $S^{\ge}$ is a convex set; but given that
$S^{\ge}$ is a convex set, we can conclude only that the function $f$ is quasiconcave (but
not necessarily concave).
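
For a function of one variable, the test in (12.21) can be tried out on a grid, where a set of grid points counts as "convex" if it forms an unbroken interval. The following is only a sketch under that simplifying assumption; the three test functions (a bell curve, a full parabola, and a cubic) and the helper names are our own illustrative choices.

```python
import math


def is_interval(mask):
    """True if the True entries of mask are contiguous (or absent)."""
    idx = [i for i, flag in enumerate(mask) if flag]
    return not idx or idx[-1] - idx[0] + 1 == len(idx)


def level_set_test(f, xs):
    """Grid version of (12.21).

    Returns (upper_ok, lower_ok): whether every set {x | f(x) >= k} and
    every set {x | f(x) <= k} is an interval of the grid, with k ranging
    over the function values attained on the grid.
    """
    values = [f(x) for x in xs]
    ks = sorted(set(values))
    upper_ok = all(is_interval([val >= k for val in values]) for k in ks)
    lower_ok = all(is_interval([val <= k for val in values]) for k in ks)
    return upper_ok, lower_ok


def bell(x):
    return math.exp(-x * x)


def parabola(x):
    return x * x


def cubic(x):
    return x ** 3


xs = [i / 10 for i in range(-30, 31)]  # grid on [-3, 3]

print(level_set_test(bell, xs))      # (True, False)
print(level_set_test(parabola, xs))  # (False, True)
print(level_set_test(cubic, xs))     # (True, True)
```

The three outcomes mirror the three panels of Fig. 12.5: quasiconcave only, quasiconvex only, and (for the monotonic case) both. Consistent with the remark above, the test says nothing about strictness.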

Example 1 Check $z = x^2$ $(x \ge 0)$ for quasiconcavity and quasiconvexity. This function
is easily verified geometrically to be convex, in fact strictly so. Hence it is quasiconvex.
Interestingly, it is also quasiconcave. For its graph (the right half of a U-shaped curve,
initiating from the point of origin and increasing at an increasing rate) is, similarly to
Fig. 12.5c, capable of generating a convex $S^{\ge}$ as well as a convex $S^{\le}$.

If we wish to apply (12.20) instead, we first let $u$ and $v$ be any two distinct nonnegative
values of $x$. Then

$$f(u) = u^2 \qquad f(v) = v^2 \qquad \text{and} \qquad f[\theta u + (1-\theta)v] = [\theta u + (1-\theta)v]^2$$

Suppose that $f(v) \ge f(u)$, that is, $v^2 \ge u^2$; then $v \ge u$, or more specifically,
$v > u$ (since $u$ and $v$ are distinct). Inasmuch as the weighted average
$[\theta u + (1-\theta)v]$ must lie between $u$ and $v$, we may write the continuous inequality

$$v^2 > [\theta u + (1-\theta)v]^2 > u^2 \qquad \text{for } 0 < \theta < 1$$

or

$$f(v) > f[\theta u + (1-\theta)v] > f(u) \qquad \text{for } 0 < \theta < 1$$

By (12.20), this result makes the function $f$ both quasiconcave and quasiconvex, indeed
strictly so.

Example 2 Show that $z = f(x, y) = xy$ (with $x, y \ge 0$) is quasiconcave. We shall use the
criterion in (12.21) and establish that the set $S^{\ge} = \{(x, y) \mid xy \ge k\}$ is a convex
set for any $k$. For this purpose, we set $xy = k$ to obtain an isovalue curve for each value
of $k$. Like $x$ and $y$, $k$ should be nonnegative. In case $k > 0$, the isovalue curve is a
rectangular hyperbola in the first quadrant of the $xy$ plane. The set $S^{\ge}$, consisting of
all the points on or above a rectangular hyperbola, is a convex set. In the other case, with
$k = 0$, the isovalue curve as defined by $xy = 0$ is L-shaped, with the L coinciding with the
nonnegative segments of the $x$ and $y$ axes. The set $S^{\ge}$, consisting this time of the
entire nonnegative quadrant, is again a convex set. Thus, by (12.21), the function $z = xy$
(with $x, y \ge 0$) is quasiconcave.

You should be careful not to confuse the shape of the isovalue curves $xy = k$ (which is
defined in the $xy$ plane) with the shape of the surface $z = xy$ (which is defined in the
$xyz$ space). The characteristic of the $z$ surface (quasiconcave in 3-space) is what we wish
to ascertain; the shape of the isovalue curves (convex in 2-space for positive $k$) is of
interest here only as a means to delineate the sets $S^{\ge}$ in order to apply the criterion
in (12.21).

Example 3


Show that $z = f(x, y) = (x-a)^2 + (y-b)^2$ is quasiconvex. Let us again apply (12.21).
Setting $(x-a)^2 + (y-b)^2 = k$, we see that $k$ must be nonnegative. For each $k$, the
isovalue curve is a circle in the $xy$ plane with its center at $(a, b)$ and with radius
$\sqrt{k}$. Since $S^{\le} = \{(x, y) \mid (x-a)^2 + (y-b)^2 \le k\}$ is the set of all points on
or inside a circle, it constitutes a convex set. This is true even when $k = 0$ (when the
circle degenerates into a single point, $(a, b)$), since by convention a single point is
considered as a convex set. Thus the given function is quasiconvex.
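
As a sanity check on Example 2, the inequality in (12.20) can be sampled at random points of the nonnegative quadrant. What follows is a sketch only: random sampling cannot prove quasiconcavity, and the sampling range, seed, and helper name `violates_quasiconcavity` are arbitrary assumptions of ours. The last line shows a single pair of points, outside the nonnegative quadrant, at which the inequality fails.

```python
import random


def f(x, y):
    return x * y


def violates_quasiconcavity(u, v, theta, tol=1e-9):
    """True if (12.20) fails at u, v, theta for f(x, y) = x*y."""
    w = tuple(theta * a + (1 - theta) * b for a, b in zip(u, v))
    return f(*w) < min(f(*u), f(*v)) - tol


rng = random.Random(0)  # fixed seed so the run is reproducible
failures = 0
for _ in range(10_000):
    u = (rng.uniform(0, 10), rng.uniform(0, 10))
    v = (rng.uniform(0, 10), rng.uniform(0, 10))
    theta = rng.uniform(0, 1)
    failures += violates_quasiconcavity(u, v, theta)

print(failures)  # 0

# With negative values allowed, the property is lost:
print(violates_quasiconcavity((-1, -1), (1, 1), 0.5))  # True
```

In the second check both endpoints have $xy = 1$, while the midpoint is the origin, where $xy = 0$.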

Differentiable Functions 

The definitions (12.20) and (12.21) do not require differentiability of the function $f$. If
$f$ is differentiable, however, quasiconcavity and quasiconvexity can alternatively be defined
in terms of its first derivatives:

A differentiable function of one variable, $f(x)$, is quasiconcave (quasiconvex) iff, for any
pair of distinct points $u$ and $v$ in the domain,

$$f(v) \ge f(u) \;\Rightarrow\; \left\{ \begin{matrix} f'(u)(v - u) \\ f'(v)(v - u) \end{matrix} \right\} \ge 0 \tag{12.22}$$

where the upper expression applies to quasiconcavity and the lower one to quasiconvexity.


Quasiconcavity and quasiconvexity will be strict, if the weak inequality on the right is 
changed to the strict inequality > 0. When there are two or more independent variables, the 
definition is to be modified as follows: 


A differentiable function $f(x_1, \ldots, x_n)$ is quasiconcave (quasiconvex) iff, for any two
distinct points $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$ in the domain,

$$f(v) \ge f(u) \;\Rightarrow\; \left\{ \begin{matrix} \sum_{j=1}^{n} f_j(u)(v_j - u_j) \\ \sum_{j=1}^{n} f_j(v)(v_j - u_j) \end{matrix} \right\} \ge 0 \tag{12.22'}$$

where $f_j \equiv \partial f/\partial x_j$, to be evaluated at $u$ or $v$ as the case may be.


Again, for strict quasiconcavity and quasiconvexity, the weak inequality on the right should
be changed to the strict inequality > 0.
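
The one-variable test in (12.22) is easy to try out numerically. The sketch below checks it on a grid for two illustrative functions of our own choosing (neither appears in the text): the hill-shaped $4x - x^2$ on $[0, 4]$ and the increasing $\ln(1 + x)$. The function name `check_12_22` is hypothetical, and a grid check is of course only suggestive.

```python
import math


def check_12_22(f, df, points, tol=1e-12):
    """Grid check of (12.22) for a differentiable one-variable function.

    Returns (quasiconcave_ok, quasiconvex_ok).
    """
    concave_ok = convex_ok = True
    for u in points:
        for v in points:
            if u == v or f(v) < f(u):
                continue  # (12.22) only restricts pairs with f(v) >= f(u)
            if df(u) * (v - u) < -tol:   # upper line of (12.22)
                concave_ok = False
            if df(v) * (v - u) < -tol:   # lower line of (12.22)
                convex_ok = False
    return concave_ok, convex_ok


points = [i / 10 for i in range(0, 41)]  # grid on [0, 4]

hill = check_12_22(lambda x: 4 * x - x * x, lambda x: 4 - 2 * x, points)
log_curve = check_12_22(math.log1p, lambda x: 1 / (1 + x), points)

print(hill)       # (True, False)
print(log_curve)  # (True, True)
```

The hill fails the quasiconvexity line at, for instance, $u = 0$ and $v = 3$: there $f(v) = 3 \ge f(u) = 0$, but $f'(v)(v - u) = (-2)(3) < 0$.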

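Finally, if a function $z = f(x_1, \ldots, x_n)$ is twice continuously differentiable,
quasiconcavity and quasiconvexity can be checked by means of the first and second partial
derivatives of the function, arranged into the bordered determinant

$$|B| = \begin{vmatrix} 0 & f_1 & f_2 & \cdots & f_n \\ f_1 & f_{11} & f_{12} & \cdots & f_{1n} \\ f_2 & f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ f_n & f_{n1} & f_{n2} & \cdots & f_{nn} \end{vmatrix} \tag{12.23}$$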


This bordered determinant resembles the bordered Hessian $|\bar{H}|$ introduced in Sec. 12.3.
But unlike the latter, the border in $|B|$ is composed of the first derivatives of the
function $f$ rather than an extraneous constraint function $g$. It is because $|B|$ depends
exclusively on the derivatives of function $f$ itself that we can use $|B|$, along with its
leading principal minors

$$|B_1| = \begin{vmatrix} 0 & f_1 \\ f_1 & f_{11} \end{vmatrix} \qquad |B_2| = \begin{vmatrix} 0 & f_1 & f_2 \\ f_1 & f_{11} & f_{12} \\ f_2 & f_{21} & f_{22} \end{vmatrix} \qquad \cdots \qquad |B_n| = |B| \tag{12.24}$$

to characterize the configuration of that function.

We shall state here two conditions; one is necessary, and the other is sufficient. Both relate
to quasiconcavity on a domain consisting only of the nonnegative orthant (the n-dimensional
analog of the nonnegative quadrant), that is, with $x_1, \ldots, x_n \ge 0$.†

For $z = f(x_1, \ldots, x_n)$ to be quasiconcave on the nonnegative orthant, it is necessary that

$$|B_1| \le 0, \quad |B_2| \ge 0, \quad \ldots, \quad |B_n| \left\{ \begin{matrix} \le \\ \ge \end{matrix} \right\} 0 \quad \text{if } n \text{ is } \left\{ \begin{matrix} \text{odd} \\ \text{even} \end{matrix} \right\} \tag{12.25}$$

wherever the partial derivatives are evaluated in the nonnegative orthant.


† Whereas concavity (convexity) of a function on a convex domain can always be extended to
concavity (convexity) over the entire space, quasiconcavity and quasiconvexity cannot. For
instance, our conclusions in Examples 1 and 2 will not hold if the variables are allowed to take
negative values. The two conditions given here are based on Kenneth J. Arrow and Alain C.
Enthoven, "Quasi-Concave Programming," Econometrica, October 1961, p. 797 (Theorem 5),
and Akira Takayama, Analytical Methods in Economics, University of Michigan Press, 1993, p. 65
(Theorem 1.12).
A sufficient condition for $f$ to be strictly quasiconcave on the nonnegative orthant is that

$$|B_1| < 0, \quad |B_2| > 0, \quad \ldots, \quad |B_n| \left\{ \begin{matrix} < \\ > \end{matrix} \right\} 0 \quad \text{if } n \text{ is } \left\{ \begin{matrix} \text{odd} \\ \text{even} \end{matrix} \right\} \tag{12.26}$$

wherever the partial derivatives are evaluated in the nonnegative orthant.

Note that the condition $|B_1| \le 0$ in (12.25) is automatically satisfied because
$|B_1| = -f_1^2$; it is listed here only for the sake of symmetry. So is the condition
$|B_1| < 0$ in (12.26).

Example 4


The function $z = f(x_1, x_2) = x_1 x_2$ $(x_1, x_2 \ge 0)$ is quasiconcave (cf. Example 2). We
shall now check this by (12.22'). Let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ be any two points
in the domain. Then $f(u) = u_1 u_2$ and $f(v) = v_1 v_2$. Assume that

$$f(v) \ge f(u) \quad \text{or} \quad v_1 v_2 \ge u_1 u_2 \qquad (v_1, v_2, u_1, u_2 \ge 0) \tag{12.27}$$

Since the partial derivatives of $f$ are $f_1 = x_2$ and $f_2 = x_1$, (12.22') amounts to the
condition that

$$f_1(u)(v_1 - u_1) + f_2(u)(v_2 - u_2) = u_2(v_1 - u_1) + u_1(v_2 - u_2) \ge 0$$

or, upon rearrangement,

$$u_2(v_1 - u_1) \ge u_1(u_2 - v_2) \tag{12.28}$$

We need to consider four possibilities regarding the values of $u_1$ and $u_2$. First, if
$u_1 = u_2 = 0$, then (12.28) is trivially satisfied. Second, if $u_1 = 0$ but $u_2 > 0$, then
(12.28) reduces to the condition $u_2 v_1 \ge 0$, which is again satisfied since $u_2$ and $v_1$
are both nonnegative. Third, if $u_1 > 0$ and $u_2 = 0$, then (12.28) reduces to the condition
$0 \ge -u_1 v_2$, which is still satisfied. Fourth and last, suppose that $u_1$ and $u_2$ are
both positive, so that $v_1$ and $v_2$ are also positive. Subtracting $v_2 u_1$ from both sides
of (12.27), we obtain

$$v_2(v_1 - u_1) \ge u_1(u_2 - v_2) \tag{12.29}$$

Three subpossibilities now present themselves:

1. If $u_2 = v_2$, then $v_1 \ge u_1$. In fact, we should have $v_1 > u_1$ since $(u_1, u_2)$ and
$(v_1, v_2)$ are distinct points. The fact that $u_2 = v_2$ and $v_1 > u_1$ implies that
condition (12.28) is satisfied.

2. If $u_2 > v_2$, then we must also have $v_1 > u_1$ by (12.29). Multiplying both sides of
(12.29) by $u_2/v_2$, we get

$$\begin{aligned} u_2(v_1 - u_1) &\ge \frac{u_2}{v_2}\, u_1(u_2 - v_2) \\ &> u_1(u_2 - v_2) \qquad \left[\text{since } \frac{u_2}{v_2} > 1\right] \end{aligned} \tag{12.30}$$

Thus (12.28) is again satisfied.

3. The final subpossibility is that $u_2 < v_2$, implying that $u_2/v_2$ is a positive fraction.
In this case, the first line of (12.30) still holds. The second line also holds, but now for a
different reason: a fraction $(u_2/v_2)$ of a negative number $(u_2 - v_2)$ is greater than the
latter number itself.
Inasmuch as (12.28) is satisfied in every possible situation that can arise, the function
$z = x_1 x_2$ $(x_1, x_2 \ge 0)$ is quasiconcave. Therefore, the necessary condition (12.25)
should hold. Because the partial derivatives of $f$ are

$$f_1 = x_2 \qquad f_2 = x_1 \qquad f_{11} = f_{22} = 0 \qquad f_{12} = f_{21} = 1$$

the relevant leading principal minors turn out to be

$$|B_1| = \begin{vmatrix} 0 & x_2 \\ x_2 & 0 \end{vmatrix} = -x_2^2 \le 0 \qquad |B_2| = \begin{vmatrix} 0 & x_2 & x_1 \\ x_2 & 0 & 1 \\ x_1 & 1 & 0 \end{vmatrix} = 2x_1 x_2 \ge 0$$

Thus (12.25) is indeed satisfied. Note, however, that the sufficient condition (12.26) is
satisfied only over the positive orthant.

Example 5 Show that $z = f(x, y) = x^a y^b$ $(x, y > 0;\ 0 < a, b < 1)$ is strictly
quasiconcave. The partial derivatives of this function are

$$f_x = a x^{a-1} y^b \qquad f_y = b x^a y^{b-1}$$

$$f_{xx} = a(a-1) x^{a-2} y^b \qquad f_{xy} = f_{yx} = ab\, x^{a-1} y^{b-1} \qquad f_{yy} = b(b-1) x^a y^{b-2}$$

Thus the leading principal minors of $|B|$ have the following signs:

$$|B_1| = \begin{vmatrix} 0 & f_x \\ f_x & f_{xx} \end{vmatrix} = -(a x^{a-1} y^b)^2 < 0$$

$$|B_2| = \begin{vmatrix} 0 & f_x & f_y \\ f_x & f_{xx} & f_{xy} \\ f_y & f_{yx} & f_{yy} \end{vmatrix} = [2a^2b^2 - a(a-1)b^2 - a^2 b(b-1)]\, x^{3a-2} y^{3b-2} > 0$$

This satisfies the sufficient condition for strict quasiconcavity in (12.26).
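
The signs claimed in Example 5 can be spot-checked numerically. In the sketch below the exponents $a = 0.3$, $b = 0.6$ and the three evaluation points are arbitrary illustrative values of ours, and `det3` and `bordered_minors` are hypothetical helper names. The code evaluates $|B_1|$ and $|B_2|$ from the partial derivatives and compares $|B_2|$ with the closed-form expression given above.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bordered_minors(x, y, a, b):
    """|B1| and |B2| of (12.24) for z = x**a * y**b."""
    fx = a * x ** (a - 1) * y ** b
    fy = b * x ** a * y ** (b - 1)
    fxx = a * (a - 1) * x ** (a - 2) * y ** b
    fxy = a * b * x ** (a - 1) * y ** (b - 1)
    fyy = b * (b - 1) * x ** a * y ** (b - 2)
    b1 = -fx * fx
    b2 = det3([[0, fx, fy], [fx, fxx, fxy], [fy, fxy, fyy]])
    return b1, b2


a, b = 0.3, 0.6   # illustrative exponents with 0 < a, b < 1
results = []
for x, y in [(0.5, 2.0), (1.0, 1.0), (3.0, 0.25)]:
    b1, b2 = bordered_minors(x, y, a, b)
    closed_form = ((2 * a**2 * b**2 - a * (a - 1) * b**2 - a**2 * b * (b - 1))
                   * x ** (3 * a - 2) * y ** (3 * b - 2))
    results.append((b1 < 0, b2 > 0, math.isclose(b2, closed_form, rel_tol=1e-9)))

print(results)  # every entry is (True, True, True)
```

At $(x, y) = (1, 1)$ both routes give $|B_2| = 0.162$, which can be confirmed by hand from $2a^2b^2 - a(a-1)b^2 - a^2b(b-1)$.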

A Further Look at the Bordered Hessian 

The bordered determinant $|B|$, as defined in (12.23), differs from the bordered Hessian

$$|\bar{H}| = \begin{vmatrix} 0 & g_1 & g_2 & \cdots & g_n \\ g_1 & Z_{11} & Z_{12} & \cdots & Z_{1n} \\ g_2 & Z_{21} & Z_{22} & \cdots & Z_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ g_n & Z_{n1} & Z_{n2} & \cdots & Z_{nn} \end{vmatrix}$$

in two ways: (1) the border elements in $|B|$ are the first-order partial derivatives of
function $f$ rather than $g$; and (2) the remaining elements in $|B|$ are the second-order
partial derivatives of $f$ rather than the Lagrangian function $Z$. However, in the special
case of a linear constraint equation, $g(x_1, \ldots, x_n) = a_1 x_1 + \cdots + a_n x_n = c$ (a
case frequently encountered in economics; see Sec. 12.5), $Z_{ij}$ reduces to $f_{ij}$. For
then the Lagrangian function is

$$Z = f(x_1, \ldots, x_n) + \lambda(c - a_1 x_1 - \cdots - a_n x_n)$$

so that

$$Z_j = f_j - \lambda a_j \qquad \text{and} \qquad Z_{ij} = f_{ij}$$
Turning to the borders, we note that the linear constraint function yields the first derivative
$g_j = a_j$. Moreover, when the first-order condition is satisfied, we have
$Z_j = f_j - \lambda a_j = 0$, so that $f_j = \lambda a_j$, or $f_j = \lambda g_j$. Thus the
border in $|B|$ is simply that of $|\bar{H}|$ multiplied by a positive scalar $\lambda$. By
factoring out $\lambda$ successively from the horizontal and vertical borders of $|B|$ (see
Sec. 5.3, Example 5), we have

$$|B| = \lambda^2 |\bar{H}|$$

Consequently, in the linear-constraint case, the two bordered determinants always possess
the same sign at the stationary point of $Z$. By the same token, the leading principal minors
$|B_i|$ and $|\bar{H}_i|$ $(i = 1, \ldots, n)$ must also share the same sign at that point. It
then follows that if the bordered determinant $|B|$ satisfies the sufficient condition for
strict quasiconcavity in (12.26), the bordered Hessian $|\bar{H}|$ must then satisfy the
second-order sufficient condition for constrained maximization in Table 12.1.
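
The relation $|B| = \lambda^2 |\bar{H}|$ can be confirmed on a small numerical case. The problem used below, maximizing $f = xy$ subject to $x + 2y = 8$, is purely an illustrative assumption of ours, as is the helper name `det3`; the stationary point is obtained from the first-order conditions $y = \lambda a_1$, $x = \lambda a_2$ together with the constraint.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Illustrative problem: maximize f = x*y subject to a1*x + a2*y = c.
a1, a2, c = 1.0, 2.0, 8.0

# First-order conditions: f_x = y = lam*a1, f_y = x = lam*a2, a1*x + a2*y = c.
lam = c / (2 * a1 * a2)
x, y = lam * a2, lam * a1

fx, fy = y, x
fxx, fxy, fyy = 0.0, 1.0, 0.0

B = det3([[0, fx, fy], [fx, fxx, fxy], [fy, fxy, fyy]])
H = det3([[0, a1, a2], [a1, fxx, fxy], [a2, fxy, fyy]])  # Z_ij = f_ij here

print(lam, x, y)        # 2.0 4.0 2.0
print(B, lam**2 * H)    # 16.0 16.0
```

Here $|\bar{H}| = 4$ and $\lambda = 2$, so the two determinants differ by the positive factor $\lambda^2 = 4$ and necessarily share the same sign.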

Absolute versus Relative Extrema 

A more comprehensive picture of the relationship between quasiconcavity and second- 
order conditions is presented in Fig. 12.6. (A suitable modification will adapt the figure for 
quasiconvexity.) Constructed in the same spirit—and to be read in the same manner—as 
Fig. 11.5, this figure relates quasiconcavity to absolute as well as relative constrained
maxima of a twice-differentiable function $z = f(x_1, \ldots, x_n)$. The three ovals in the
upper part summarize the first- and second-order conditions for a relative constrained
maximum. And
the rectangles in the middle column, like those in Fig. 11.5, tie the concepts of relative 
maximum, absolute maximum, and unique absolute maximum to one another. 

But the really interesting information is that in the two diamonds and the elongated
$\Rightarrow$ symbols passing through them. The one on the left tells us that, once the
first-order condition is satisfied, and if the two provisos listed in the diamond are also
satisfied, we have a sufficient condition for an absolute constrained maximum. The first
proviso is that the function $f$ be explicitly quasiconcave, a new term which we must hasten
to define.

A quasiconcave function $f$ is explicitly quasiconcave if it has the further property that

$$f(v) > f(u) \;\Rightarrow\; f[\theta u + (1-\theta)v] > f(u)$$

This defining property means that whenever a point on the surface, $f(v)$, is higher than
another, $f(u)$, then all the intermediate points (the points on the surface lying directly
above line segment $uv$ in the domain) must also be higher than $f(u)$. What such a
stipulation does is to rule out any horizontal plane segments on the surface except for a
plateau at the top of the surface.† Note that the condition for explicit quasiconcavity is
not as strong as the condition for strict quasiconcavity, since the latter requires
$f[\theta u + (1-\theta)v] > f(u)$ even for $f(v) = f(u)$, implying that nonhorizontal plane
segments are ruled out, too.‡ The other


† Let the surface contain a horizontal plane segment $P$ such that $f(u) \in P$ and
$f(v) \notin P$. Then those intermediate points that are located on $P$ will be of equal
height to $f(u)$, thereby violating the first proviso.

‡ Let the surface contain a slanted plane segment $P'$ such that $f(u) = f(v)$ are both
located on $P'$. Then all the intermediate points will also be on $P'$ and be of equal height
to $f(u)$, thereby violating the cited requirement for strict quasiconcavity.
FIGURE 12.6

proviso in the left-side diamond is that the set
$\{(x_1, \ldots, x_n) \mid g(x_1, \ldots, x_n) = c\}$ be convex. When both provisos are met,
we shall be dealing with that portion of a bell-shaped, horizontal-segment-free surface (or
hypersurface) lying directly above a convex set in the domain. A local maximum found on such
a subset of the surface must be an absolute constrained maximum.

The diamond on the right in Fig. 12.6 involves the stronger condition of strict
quasiconcavity. A strictly quasiconcave function must be explicitly quasiconcave, although the
converse is not true. Hence, when strict quasiconcavity replaces explicit quasiconcavity, an
absolute constrained maximum is still ensured. But this time that absolute constrained
maximum must also be unique, since the absence of any plane segment anywhere on the
surface decidedly precludes the possibility of multiple constrained maxima.


EXERCISE 12.4 

1. Draw a strictly quasiconcave curve $z = f(x)$ which is
(a) also quasiconvex (d) not concave
(b) not quasiconvex (e) neither concave nor convex
(c) not convex (f) both concave and convex

2. Are the following functions quasiconcave? Strictly so? First check graphically, and then
algebraically by (12.20). Assume that $x \ge 0$.
(a) $f(x) = a$ (b) $f(x) = a + bx$ $(b > 0)$ (c) $f(x) = a + cx^2$ $(c < 0)$

3. (a) Let $z = f(x)$ plot as a negatively sloped curve shaped like the right half of a bell in
the first quadrant, passing through the points (0, 5), (2, 4), (3, 2), and (5, 1). Let
$z = g(x)$ plot as a positively sloped 45° line. Are $f(x)$ and $g(x)$ quasiconcave?
(b) Now plot the sum $f(x) + g(x)$. Is the sum function quasiconcave?

4. By examining their graphs, and using (12.21), check whether the following functions
are quasiconcave, quasiconvex, both, or neither:
(a) $f(x) = x^3 - 2x$ (b) $f(x_1, x_2) = 6x_1 - 9x_2$ (c) $f(x_1, x_2) = x_2 - \ln x_1$

5. (a) Verify that a cubic function $z = ax^3 + bx^2 + cx + d$ is in general neither
quasiconcave nor quasiconvex.
(b) Is it possible to impose restrictions on the parameters such that the function
becomes both quasiconcave and quasiconvex for $x \ge 0$?

6. Use (12.22) to check $z = x^2$ $(x \ge 0)$ for quasiconcavity and quasiconvexity.

7. Show that $z = xy$ $(x, y \ge 0)$ is not quasiconvex.

8. Use bordered determinants to check the following functions for quasiconcavity and
quasiconvexity:
(a) $z = -x^2 - y^2$ $(x, y > 0)$ (b) $z = -(x+1)^2 - (y+2)^2$ $(x, y > 0)$


12.5 Utility Maximization and Consumer Demand

The maximization of a utility function was cited in Sec. 12.1 as an example of constrained
optimization. Let us now reexamine this problem in more detail. For simplicity, we shall
still allow our hypothetical consumer the choice of only two goods, both of which have
continuous, positive marginal-utility functions. The prices of both goods are
market-determined, hence exogenous, although in this section we shall omit the zero subscript
from the price symbols. If the purchasing power of the consumer is a given amount $B$ (for
budget), the problem posed will be that of maximizing a smooth utility (index) function

$$U = U(x, y) \qquad (U_x, U_y > 0)$$

subject to

$$xP_x + yP_y = B$$
First-Order Condition

The Lagrangian function of this optimization model is

$$Z = U(x, y) + \lambda(B - xP_x - yP_y)$$

As the first-order condition, we have the following set of simultaneous equations:

$$\begin{aligned} Z_\lambda &= B - xP_x - yP_y = 0 \\ Z_x &= U_x - \lambda P_x = 0 \\ Z_y &= U_y - \lambda P_y = 0 \end{aligned} \tag{12.31}$$

Since the last two equations are equivalent to

$$\frac{U_x}{P_x} = \frac{U_y}{P_y} = \lambda \tag{12.31'}$$

the first-order condition in effect calls for the satisfaction of (12.31'), subject to the
budget constraint, the first equation in (12.31). What (12.31') states is merely the familiar
proposition in classical consumer theory that, in order to maximize utility, consumers must
allocate their budgets so as to equalize the ratio of marginal utility to price for every
commodity. Specifically, in the equilibrium or optimum, these ratios should have the common
value $\lambda^*$. As we learned earlier, $\lambda^*$ measures the comparative-static effect of
the constraint constant on the optimal value of the objective function. Hence, we have in the
present context $\lambda^* = \partial U^*/\partial B$; that is, the optimal value of the
Lagrange multiplier can be interpreted as the marginal utility of money (budget money) when
the consumer's utility is maximized.

If we restate the condition in (12.31') in the form

$$\frac{U_x}{U_y} = \frac{P_x}{P_y} \tag{12.31''}$$

the first-order condition can be given an alternative interpretation, in terms of indifference
curves.

An indifference curve is defined as the locus of the combinations of $x$ and $y$ that will
yield a constant level of $U$. This means that on an indifference curve we must find

$$dU = U_x\,dx + U_y\,dy = 0$$

with the implication that $dy/dx = -U_x/U_y$. Accordingly, if we plot an indifference
curve in the $xy$ plane, as in Fig. 12.7, its slope, $dy/dx$, must be equal to the negative of
the marginal-utility ratio $U_x/U_y$. (Since we assume $U_x, U_y > 0$, the slope of the
indifference curve must be negative.) Note that $U_x/U_y$, the negative of the
indifference-curve slope, is called the marginal rate of substitution between the two goods.

What about the meaning of $P_x/P_y$? As we shall presently see, this ratio represents the
negative of the slope of the graph of the budget constraint. The budget constraint,
$xP_x + yP_y = B$, can be written alternatively as

$$y = \frac{B}{P_y} - \frac{P_x}{P_y}\,x$$

so that, when plotted in the $xy$ plane as in Fig. 12.7, it emerges as a straight line with
slope $-P_x/P_y$ (and vertical intercept $B/P_y$).
FIGURE 12.7

In this light, the new version of the first-order condition, (12.31'') plus the budget
constraint, discloses that, to maximize utility, a consumer must allocate the budget such
that the slope of the budget line (on which the consumer must remain) is equal to the slope
of some indifference curve. This condition is met at point $E$ in Fig. 12.7a, where the budget
line is tangent to an indifference curve.
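
To make the first-order condition concrete, consider the utility function $U = xy$ with $P_x = 2$, $P_y = 5$, and $B = 100$. These are illustrative assumptions of ours and not a case worked in the text. For this function (12.31) can be solved by hand, giving $x^* = B/2P_x$ and $y^* = B/2P_y$, and the sketch below (with hypothetical function names) checks both the equal-ratio rule (12.31') and the interpretation of $\lambda^*$ as the marginal utility of budget money, the latter by a finite-difference approximation.

```python
def optimum(px, py, budget):
    """Closed-form solution of (12.31) for the illustrative utility U = x*y."""
    x = budget / (2 * px)
    y = budget / (2 * py)
    lam = y / px            # from U_x - lam*P_x = 0, with U_x = y
    return x, y, lam


def max_utility(px, py, budget):
    """Maximized utility U* as a function of prices and budget."""
    x, y, _ = optimum(px, py, budget)
    return x * y


px, py, budget = 2.0, 5.0, 100.0
x, y, lam = optimum(px, py, budget)

# (12.31'): marginal utility per dollar is equalized across the two goods.
print(y / px, x / py, lam)        # 5.0 5.0 5.0

# lam should match dU*/dB; compare with a central finite difference.
h = 1e-4
dU_dB = (max_utility(px, py, budget + h)
         - max_utility(px, py, budget - h)) / (2 * h)
print(abs(dU_dB - lam) < 1e-6)    # True
```

The optimal bundle is $(x^*, y^*) = (25, 10)$, which exhausts the budget, and the slope of the indifference curve there, $-U_x/U_y = -10/25$, equals the slope of the budget line, $-P_x/P_y = -2/5$.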


Second-Order Condition 

If the bordered Hessian in the present problem is positive, i.e., if

$$|\bar{H}| = \begin{vmatrix} 0 & P_x & P_y \\ P_x & U_{xx} & U_{xy} \\ P_y & U_{yx} & U_{yy} \end{vmatrix} = 2P_xP_yU_{xy} - P_y^2U_{xx} - P_x^2U_{yy} > 0 \tag{12.32}$$

(with all the derivatives evaluated at the critical values $x^*$ and $y^*$), then the stationary
value of $U$ will assuredly be a maximum. The presence of the derivatives $U_{xx}$, $U_{yy}$,
and $U_{xy}$ in (12.32) clearly suggests that meeting this condition would entail certain
restrictions on the utility function and, hence, on the shape of the indifference curves. What
are these restrictions?

Considering first the shape of the indifference curves, we can show that a positive
$|\bar{H}|$ means the strict convexity of the (downward-sloping) indifference curve at the
point of tangency $E$. Just as the downward slope of an indifference curve is guaranteed by a
negative $dy/dx$ $(= -U_x/U_y)$, its strict convexity would be ensured by a positive
$d^2y/dx^2$. To get the expression for $d^2y/dx^2$, we can differentiate $-U_x/U_y$ with
respect to $x$; but in doing so, we should bear in mind not only that both $U_x$ and $U_y$
(being derivatives) are functions of $x$ and $y$ but also that, along a given indifference
curve, $y$ is itself a function of $x$. Accordingly, both $U_x$ and $U_y$ can be considered as
functions of $x$ alone; therefore, we can get a total derivative

$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(-\frac{U_x}{U_y}\right) = -\frac{1}{U_y^2}\left(U_y\frac{dU_x}{dx} - U_x\frac{dU_y}{dx}\right) \tag{12.33}$$

Since $x$ can affect $U_x$ and $U_y$ not only directly but also indirectly, via the
intermediary of $y$, we have

$$\frac{dU_x}{dx} = U_{xx} + U_{yx}\frac{dy}{dx} \qquad \frac{dU_y}{dx} = U_{xy} + U_{yy}\frac{dy}{dx} \tag{12.34}$$
where $dy/dx$ refers to the slope of the indifference curve. Now, at the point of tangency
$E$ (the only point relevant to the discussion of the second-order condition) this slope is
identical with that of the budget constraint; that is, $dy/dx = -P_x/P_y$. Thus we can rewrite
(12.34) as

$$\frac{dU_x}{dx} = U_{xx} - U_{yx}\frac{P_x}{P_y} \qquad \frac{dU_y}{dx} = U_{xy} - U_{yy}\frac{P_x}{P_y} \tag{12.34'}$$

Substituting (12.34') into (12.33) and utilizing the information that

$$U_x = \frac{U_y P_x}{P_y} \qquad [\text{from (12.31'')}]$$

and then factoring out $U_y/P_y^2$, we can finally transform (12.33) into

$$\frac{d^2y}{dx^2} = \frac{2P_xP_yU_{xy} - P_y^2U_{xx} - P_x^2U_{yy}}{U_yP_y^2} = \frac{|\bar{H}|}{U_yP_y^2} \tag{12.33'}$$


It is clear that when the second-order sufficient condition (12.32) is satisfied, the second
derivative in (12.33') is positive, and the relevant indifference curve is strictly convex at the
point of tangency. In the present context, it is also true that the strict convexity of the
indifference curve at the tangency implies the satisfaction of the sufficient condition (12.32).
This is because, given that the indifference curves are negatively sloped, with no stationary
points anywhere, the possibility of a zero $d^2y/dx^2$ value on a strictly convex curve is
ruled out. Thus strict convexity can now result only in a positive $d^2y/dx^2$, and hence a
positive $|\bar{H}|$, by (12.33').
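
The link in (12.33') between $|\bar{H}|$ and the curvature of the indifference curve can be verified on a simple case. As an illustrative assumption of ours, take $U = xy$ with $P_x = 2$, $P_y = 5$, $B = 100$, for which the tangency occurs at $(x^*, y^*) = (25, 10)$. The indifference curve through that point is $y = k/x$ with $k = x^* y^*$, so its second derivative, $2k/x^3$, can be computed directly and compared with the right-hand side of (12.33').

```python
px, py, budget = 2.0, 5.0, 100.0
x, y = budget / (2 * px), budget / (2 * py)   # tangency point for U = x*y

ux, uy = y, x                      # first derivatives of U = x*y
uxx, uxy, uyy = 0.0, 1.0, 0.0      # second derivatives

# Bordered Hessian (12.32), in expanded form.
H = 2 * px * py * uxy - py**2 * uxx - px**2 * uyy

# Right-hand side of (12.33').
curvature_formula = H / (uy * py**2)

# Direct computation along the indifference curve y = k/x.
k = x * y
curvature_direct = 2 * k / x**3

print(H)                                    # 20.0
print(curvature_formula, curvature_direct)  # 0.032 0.032
```

Both routes give a positive second derivative, in agreement with the positive $|\bar{H}|$.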

Recall, however, that the derivatives in $|\bar{H}|$ are to be evaluated at the critical values
$x^*$ and $y^*$ only. Thus the strict convexity of the indifference curve, as a sufficient
condition, pertains only to the point of tangency, and it is not inconceivable for the curve to
contain a concave segment away from point $E$, as illustrated by the broken curve segment in
Fig. 12.7a. On the other hand, if the utility function is known to be a smooth, increasing,
strictly quasiconcave function, then every indifference curve will be everywhere strictly
convex. Such a utility function has a surface like the one in Fig. 12.4b. When such a surface
is cut with a plane parallel to the $xy$ plane, we obtain for each of such cuts a cross section
which, when projected onto the $xy$ plane, becomes a strictly convex, downward-sloping
indifference curve. In that event, no matter where the point of tangency may occur, the
second-order sufficient condition will always be satisfied. Besides, there can exist only
one point of tangency, one that yields the unique absolute maximum level of utility
attainable on the given linear budget. This result, of course, conforms perfectly to what the
diamond on the right of Fig. 12.6 states.

You have been repeatedly reminded that the second-order sufficient condition is not
necessary. Let us illustrate here the maximization of utility while (12.32) fails to hold.
Suppose that, as illustrated in Fig. 12.7b, the relevant indifference curve contains a linear
segment that coincides with a portion of the budget line. Then clearly we have multiple maxima,
since the first-order condition $U_x/U_y = P_x/P_y$ is now satisfied at every point on the
linear segment of the indifference curve, including $E_1$, $E_2$, and $E_3$. In fact, these are
absolute constrained maxima. But since on a line segment $d^2y/dx^2$ is zero, we have
$|\bar{H}| = 0$ by (12.33'). Thus maximization is achieved in this case even though the
second-order sufficient condition (12.32) is violated.

The fact that a linear segment appears on the indifference curve suggests the presence
of a slanted plane segment on the utility surface. This occurs when the utility function is
explicitly quasiconcave rather than strictly quasiconcave. As Fig. 12.7b shows, points $E_1$,
$E_2$, and $E_3$, all of which are located on the same (highest attainable) indifference curve,
yield the same absolute maximum utility under the given linear budget constraint. Referring
to Fig. 12.6 again, we note that this result is perfectly consistent with the message
conveyed by the diamond on the left.

Comparative-Static Analysis 

In our consumer model, the prices $P_x$ and $P_y$ are exogenous, as is the amount of the
budget $B$. If we assume the satisfaction of the second-order sufficient condition, we can
analyze the comparative-static properties of the model on the basis of the first-order
condition (12.31), viewed as a set of equations $F^j = 0$ $(j = 1, 2, 3)$, where each $F^j$
function has continuous partial derivatives. As pointed out in (12.19), the
endogenous-variable Jacobian of this set of equations must have the same value as the bordered
Hessian; that is, $|J| = |\bar{H}|$. Thus, when the second-order condition (12.32) is met,
$|J|$ must be positive and it does not vanish at the initial optimum. Consequently, the
implicit-function theorem is applicable, and we may express the optimal values of the
endogenous variables as implicit functions of the exogenous variables:

$$\begin{aligned} \lambda^* &= \lambda^*(P_x, P_y, B) \\ x^* &= x^*(P_x, P_y, B) \\ y^* &= y^*(P_x, P_y, B) \end{aligned} \tag{12.35}$$

These are known to possess continuous derivatives that give comparative-static information.
In particular, the derivatives of the last two functions $x^*$ and $y^*$, which are
descriptive of the consumer's demand behavior, can tell us how the consumer will react to
changes in prices and in the budget. To find these derivatives, however, we must first convert
(12.31) into a set of equilibrium identities as follows:

$$\begin{aligned} B - x^*P_x - y^*P_y &\equiv 0 \\ U_x(x^*, y^*) - \lambda^*P_x &\equiv 0 \\ U_y(x^*, y^*) - \lambda^*P_y &\equiv 0 \end{aligned} \tag{12.36}$$

By taking the total differential of each identity in turn (allowing every variable to change),
and noting that $U_{xy} = U_{yx}$, we then arrive at the linear system

$$\begin{aligned} -P_x\,dx^* - P_y\,dy^* &= x^*\,dP_x + y^*\,dP_y - dB \\ -P_x\,d\lambda^* + U_{xx}\,dx^* + U_{xy}\,dy^* &= \lambda^*\,dP_x \\ -P_y\,d\lambda^* + U_{yx}\,dx^* + U_{yy}\,dy^* &= \lambda^*\,dP_y \end{aligned} \tag{12.37}$$

To study the effect of a change in the budget size (also referred to as the income of the
consumer), let $dP_x = dP_y = 0$, but keep $dB \ne 0$. Then, after dividing (12.37) through by
$dB$, and interpreting each ratio of differentials as a partial derivative, we can write the
matrix equation†

$$\begin{bmatrix} 0 & -P_x & -P_y \\ -P_x & U_{xx} & U_{xy} \\ -P_y & U_{yx} & U_{yy} \end{bmatrix} \begin{bmatrix} (\partial\lambda^*/\partial B) \\ (\partial x^*/\partial B) \\ (\partial y^*/\partial B) \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix} \tag{12.38}$$

† The matrix equation (12.38) can also be obtained by totally differentiating (12.36) with
respect to $B$, while bearing in mind the implicit solutions in (12.35).
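As you can verify, the array of elements in the coefficient matrix is exactly the same as
what would appear in the Jacobian $|J|$, which has the same value as the bordered Hessian
$|\bar{H}|$ although the latter has $P_x$ and $P_y$ (rather than $-P_x$ and $-P_y$) in the
first row and the first column. By Cramer's rule, we can solve for all three
comparative-static derivatives, but we shall confine our attention to the following two:

$$\left(\frac{\partial x^*}{\partial B}\right) = \frac{1}{|J|} \begin{vmatrix} 0 & -1 & -P_y \\ -P_x & 0 & U_{xy} \\ -P_y & 0 & U_{yy} \end{vmatrix} = \frac{1}{|J|} \begin{vmatrix} -P_x & U_{xy} \\ -P_y & U_{yy} \end{vmatrix} \tag{12.39}$$

$$\left(\frac{\partial y^*}{\partial B}\right) = \frac{1}{|J|} \begin{vmatrix} 0 & -P_x & -1 \\ -P_x & U_{xx} & 0 \\ -P_y & U_{yx} & 0 \end{vmatrix} = \frac{-1}{|J|} \begin{vmatrix} -P_x & U_{xx} \\ -P_y & U_{yx} \end{vmatrix} \tag{12.40}$$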

By the second-order condition, $|J| = |\bar{H}|$ is positive, as are $P_x$ and $P_y$.
Unfortunately, in the absence of additional information about the relative magnitudes of
$P_x$, $P_y$, and the $U_{ij}$, we are still unable to ascertain the signs of these two
comparative-static derivatives. This means that, as the consumer's budget (or income)
increases, his or her optimal purchases $x^*$ and $y^*$ may either increase or decrease. In
case, say, $x^*$ decreases as $B$ increases, product $x$ is referred to as an inferior good as
against a normal good.
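
For a specific utility function the signs are of course determinate. As an illustration only, the sketch below assumes $U = xy$ (so that $U_{xx} = U_{yy} = 0$ and $U_{xy} = 1$) with $P_x = 2$ and $P_y = 5$, and solves the system (12.38) by Cramer's rule. The helper names `det3` and `solve3` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(a, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(a)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for r in range(3):
            replaced[r][col] = rhs[r]   # swap in the right-hand side
        solution.append(det3(replaced) / d)
    return solution


px, py = 2.0, 5.0
uxx, uxy, uyy = 0.0, 1.0, 0.0           # second derivatives of U = x*y

J = [[0.0, -px, -py],
     [-px, uxx, uxy],
     [-py, uxy, uyy]]

dlam_dB, dx_dB, dy_dB = solve3(J, [-1.0, 0.0, 0.0])

print(det3(J))        # 20.0
print(dx_dB, dy_dB)   # 0.25 0.1
```

Both derivatives are positive, so under this assumed utility function both goods are normal. The values agree with differentiating the closed-form demands $x^* = B/2P_x$ and $y^* = B/2P_y$ with respect to $B$.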

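Next, we may analyze the effect of a change in $P_x$. Letting $dP_y = dB = 0$ this time, but
keeping $dP_x \ne 0$, and then dividing (12.37) through by $dP_x$, we obtain another matrix
equation:

$$\begin{bmatrix} 0 & -P_x & -P_y \\ -P_x & U_{xx} & U_{xy} \\ -P_y & U_{yx} & U_{yy} \end{bmatrix} \begin{bmatrix} (\partial\lambda^*/\partial P_x) \\ (\partial x^*/\partial P_x) \\ (\partial y^*/\partial P_x) \end{bmatrix} = \begin{bmatrix} x^* \\ \lambda^* \\ 0 \end{bmatrix} \tag{12.41}$$

From this, the following comparative-static derivatives emerge:

$$\begin{aligned} \left(\frac{\partial x^*}{\partial P_x}\right) &= \frac{1}{|J|} \begin{vmatrix} 0 & x^* & -P_y \\ -P_x & \lambda^* & U_{xy} \\ -P_y & 0 & U_{yy} \end{vmatrix} \\ &= \frac{-x^*}{|J|} \begin{vmatrix} -P_x & U_{xy} \\ -P_y & U_{yy} \end{vmatrix} + \frac{\lambda^*}{|J|} \begin{vmatrix} 0 & -P_y \\ -P_y & U_{yy} \end{vmatrix} \\ &\equiv T_1 + T_2 \qquad [T_i \text{ means the } i\text{th term}] \end{aligned} \tag{12.42}$$

$$\begin{aligned} \left(\frac{\partial y^*}{\partial P_x}\right) &= \frac{1}{|J|} \begin{vmatrix} 0 & -P_x & x^* \\ -P_x & U_{xx} & \lambda^* \\ -P_y & U_{yx} & 0 \end{vmatrix} \\ &= \frac{x^*}{|J|} \begin{vmatrix} -P_x & U_{xx} \\ -P_y & U_{yx} \end{vmatrix} - \frac{\lambda^*}{|J|} \begin{vmatrix} 0 & -P_x \\ -P_y & U_{yx} \end{vmatrix} \\ &\equiv T_3 + T_4 \end{aligned} \tag{12.43}$$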


How do we interpret these two results? The first one, $(\partial x^*/\partial P_x)$, tells how a change in
$P_x$ affects the optimal purchase of $x$; it thus provides the basis for the study of our
consumer's demand function for $x$. There are two component terms in this effect. The first term,
$T_1$, can be rewritten, by using (12.39), as $-(\partial x^*/\partial B)x^*$. In this light, $T_1$ seems to be a
measure of the effect of a change in $B$ (budget, or income) upon the optimal purchase $x^*$,
with $x^*$ itself serving as a weighting factor. However, since this derivative obviously is
concerned with a price change, $T_1$ must be interpreted as the income effect of a price change.
As $P_x$ rises, the decline in the consumer's real income will produce an effect on $x^*$ similar
to that of an actual decrease in $B$; hence the use of the term $-(\partial x^*/\partial B)$. Understandably,
the more prominent the place of commodity $x$ in the total budget, the greater this income
effect will be, and hence the appearance of the weighting factor $x^*$ in $T_1$. This interpretation
can be demonstrated more formally by expressing the consumer's effective income
loss by the differential $dB = -x^*\,dP_x$. Then we have


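$$
x^* = -\frac{dB}{dP_x}
\tag{12.44}
$$

and

$$
T_1 = -\left(\frac{\partial x^*}{\partial B}\right)x^* = \left(\frac{\partial x^*}{\partial B}\right)\frac{dB}{dP_x}
$$

which shows $T_1$ to be the measure of the effect of $dP_x$ on $x^*$ via $B$, that is, the income
effect.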

If we now compensate the consumer for the effective income loss by a cash payment
numerically equal to $dB$, then, because of the neutralization of the income effect, the
remaining component in the comparative-static derivative $(\partial x^*/\partial P_x)$, namely, $T_2$, will
measure the change in $x^*$ due entirely to price-induced substitution of one commodity for
another, i.e., the substitution effect of the change in $P_x$. To see this more clearly, let us
return to (12.37), and consider how the income compensation will modify the situation.
When studying the effect of $dP_x$ only (with $dP_y = dB = 0$), the first equation in (12.37)
can be written as $-P_x\,dx^* - P_y\,dy^* = x^*\,dP_x$. Since the indication of the effective income
loss to the consumer lies in the expression $x^*\,dP_x$ (which, incidentally, appears only in the
first equation), to compensate the consumer means to set this term equal to zero. If so, the
vector of constants in (12.41) must be changed from
$\begin{bmatrix} x^* \\ \lambda^* \\ 0 \end{bmatrix}$ to $\begin{bmatrix} 0 \\ \lambda^* \\ 0 \end{bmatrix}$,
and the income-compensated version of the derivative $(\partial x^*/\partial P_x)$ will be

$$
\left(\frac{\partial x^*}{\partial P_x}\right)_{\text{compensated}} = \frac{1}{|J|}
\begin{vmatrix} 0 & 0 & -P_y \\ -P_x & \lambda^* & U_{xy} \\ -P_y & 0 & U_{yy} \end{vmatrix}
= \frac{\lambda^*}{|J|}
\begin{vmatrix} 0 & -P_y \\ -P_y & U_{yy} \end{vmatrix}
= T_2
$$


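Hence, we may express (12.42) in the form

$$
\left(\frac{\partial x^*}{\partial P_x}\right) = T_1 + T_2 =
\underbrace{-\left(\frac{\partial x^*}{\partial B}\right)x^*}_{\text{income effect}}
+ \underbrace{\left(\frac{\partial x^*}{\partial P_x}\right)_{\text{compensated}}}_{\text{substitution effect}}
\tag{12.42'}
$$

This result, which decomposes the comparative-static derivative $(\partial x^*/\partial P_x)$ into two
components, an income effect and a substitution effect, is the two-good version of the so-called
Slutsky equation.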
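
The decomposition can be checked numerically. The sketch below assumes the utility function $U = xy$ and the parameter values shown (neither is from the text); for that utility the optimum is $x^* = B/(2P_x)$, $\lambda^* = B/(2P_xP_y)$, and $|J| = 2P_xP_y$, so the income and substitution terms can be compared with the direct derivative of $x^*$ with respect to $P_x$.

```python
from fractions import Fraction

# Assumed example (not from the text): U = xy with P_x = 4, P_y = 6, B = 120.
Px, Py, B = Fraction(4), Fraction(6), Fraction(120)

# Closed-form optimum for U = xy: half the budget is spent on each good.
x_star = B / (2 * Px)
lam_star = B / (2 * Px * Py)          # lambda* = U_x / P_x = y* / P_x
detJ = 2 * Px * Py                    # |J| when U_xx = U_yy = 0 and U_xy = 1

dx_dB = 1 / (2 * Px)                  # derivative of x* = B / (2 P_x) with respect to B
T1 = -dx_dB * x_star                  # income effect
minor = 0 * 0 - (-Py) * (-Py)         # 2x2 determinant |0 -P_y; -P_y U_yy| with U_yy = 0
T2 = lam_star / detJ * minor          # substitution effect
direct = -B / (2 * Px**2)             # derivative of x* = B / (2 P_x) with respect to P_x

print(T1, T2, T1 + T2, direct)
```

In this example the two effects happen to be equal in size and both negative, so the income effect reinforces the substitution effect.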

What can we say about the sign of $(\partial x^*/\partial P_x)$? The substitution effect $T_2$ is clearly
negative, because $|J| > 0$ and $\lambda^* > 0$ [see (12.31')]. The income effect $T_1$, on the other hand,
is indeterminate in sign according to (12.39). Should it be negative, it would reinforce $T_2$;
in that event, an increase in $P_x$ must decrease the purchase of $x$, and the demand curve of
the utility-maximizing consumer would be negatively sloped. Should it be positive, but
relatively small in magnitude, it would dilute the substitution effect, though the overall result
would still be a downward-sloping demand curve. But in case $T_1$ is positive and dominates
$T_2$ (such as when $x^*$ is a significant item in the consumer budget, thus providing an
overwhelming weighting factor), then a rise in $P_x$ will actually lead to a larger purchase of $x$,
a special demand situation characteristic of what are called Giffen goods. Normally, of
course, we would expect $(\partial x^*/\partial P_x)$ to be negative.

Finally, let us examine the comparative-static derivative in (12.43), $(\partial y^*/\partial P_x) =
T_3 + T_4$, which has to do with the cross effect of a change in the price of $x$ on the optimal
purchase of $y$. The term $T_3$ bears a striking resemblance to term $T_1$ and again has the
interpretation of an income effect.† Note that the weighting factor here is again $x^*$ (rather than
$y^*$); this is because we are studying the effect of a change in $P_x$ on effective income, which
depends for its magnitude upon the relative importance of $x^*$ (not $y^*$) in the consumer
budget. Naturally, the remaining term, $T_4$, is again a measure of the substitution effect.

The sign of $T_3$ is, according to (12.40), dependent on such factors as $U_{xx}$, $U_{yx}$, etc., and
is indeterminate without further restrictions on the model. However, the substitution effect
$T_4$ will surely be positive in our model, since $\lambda^*$, $P_x$, $P_y$, and $|J|$ are all positive. This means
that, unless more than offset by a negative income effect, an increase in the price of $x$ will
always increase the purchase of $y$ in our two-commodity model. In other words, in the
context of the present model, where the consumer can choose only between two goods, these
goods must bear a relationship to each other as substitutes.

Even though the preceding analysis relates to the effects of a change in $P_x$, our results
are readily adaptable to the case of a change in $P_y$. Our model happens to be such that
the positions occupied by the variables $x$ and $y$ are perfectly symmetrical. Thus, to infer the
effects of a change in $P_y$, all that it takes is to interchange the roles of $x$ and $y$ in the results
already obtained.


Proportionate Changes in Prices and Income 

It is also of interest to ask how $x^*$ and $y^*$ will be affected when all three parameters $P_x$, $P_y$,
and $B$ are changed in the same proportion. Such a question still lies within the realm of
comparative statics, but unlike the preceding analysis, the present inquiry now involves the
simultaneous change of all the parameters.

When both prices are raised, along with income, by the same multiple $j$, every term in
the budget constraint will increase $j$-fold, to become

$$
jB - jxP_x - jyP_y = 0
$$

Inasmuch as the common factor $j$ can be canceled out, however, this new constraint is in fact
identical with the old. The utility function, moreover, is independent of these parameters.
Consequently, the old equilibrium levels of $x$ and $y$ will continue to prevail; that is, the
consumer equilibrium position in our model is invariant to equal proportionate changes in all
the prices and in the income. Thus, in the present model, the consumer is seen to be free
from any “money illusion.”


† If you need a stronger dose of assurance that $T_3$ represents the income effect, you can use (12.40)
and (12.44) to write

$$
T_3 = -\left(\frac{\partial y^*}{\partial B}\right)x^* = \left(\frac{\partial y^*}{\partial B}\right)\frac{dB}{dP_x}
$$

Thus $T_3$ is the effect of a change in $P_x$ on $y^*$ via the income factor $B$.

Symbolically, this situation can be described by the equations

$$
\begin{aligned}
x^*(P_x, P_y, B) &= x^*(jP_x, jP_y, jB) \\
y^*(P_x, P_y, B) &= y^*(jP_x, jP_y, jB)
\end{aligned}
$$

The functions $x^*$ and $y^*$, with the invariance property just cited, are no ordinary functions;
they are examples of a special class of function known as homogeneous functions, which
have interesting economic applications. We shall therefore examine these in Sec. 12.6.
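
The invariance of the optimal bundle can also be seen by brute force. The sketch below is only an illustration under assumed values: it takes the utility function $U = xy$ and a hypothetical helper `best_bundle` that scans a grid of points on the budget line, then compares the best bundle before and after all three parameters are tripled ($j = 3$).

```python
from fractions import Fraction

def best_bundle(Px, Py, B, steps=1200):
    """Grid-search the budget line for the bundle maximizing U = x*y (assumed utility)."""
    best = None
    for n in range(steps + 1):
        x = Fraction(n, steps) * B / Px       # x runs from 0 to B/P_x
        y = (B - x * Px) / Py                 # stay on the budget line
        if best is None or x * y > best[0]:
            best = (x * y, x, y)
    return best[1], best[2]

base = best_bundle(Fraction(4), Fraction(6), Fraction(120))
scaled = best_bundle(Fraction(12), Fraction(18), Fraction(360))   # j = 3
print(base, scaled)
```

Because the scaled budget line coincides with the original one, the search visits the same bundles and selects the same optimum.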


EXERCISE 12.5 

1. Given $U = (x + 2)(y + 1)$ and $P_x = 4$, $P_y = 6$, and $B = 130$:

(a) Write the Lagrangian function.

(b) Find the optimal levels of purchase $x^*$ and $y^*$.

(c) Is the second-order sufficient condition for maximum satisfied?

(d) Does the answer in (b) give any comparative-static information?

2. Assume that $U = (x + 2)(y + 1)$, but this time assign no specific numerical values to
the price and income parameters.

(a) Write the Lagrangian function.

(b) Find $x^*$, $y^*$, and $\lambda^*$ in terms of the parameters $P_x$, $P_y$, and $B$.

(c) Check the second-order sufficient condition for maximum.

(d) By setting $P_x = 4$, $P_y = 6$, and $B = 130$, check the validity of your answer to
Prob. 1.

3. Can your solution ($x^*$ and $y^*$) in Prob. 2 yield any comparative-static information? Find
all the comparative-static derivatives you can, evaluate their signs, and interpret their
economic meanings.

4. From the utility function $U = (x + 2)(y + 1)$ and the constraint $xP_x + yP_y = B$ of
Prob. 2, we have already found the $U_{ij}$ and $|\bar{H}|$, as well as $x^*$ and $\lambda^*$. Moreover, we
recall that $|J| = |\bar{H}|$.

(a) Substitute these into (12.39) and (12.40) to find $(\partial x^*/\partial B)$ and $(\partial y^*/\partial B)$.

(b) Substitute into (12.42) and (12.43) to find $(\partial x^*/\partial P_x)$ and $(\partial y^*/\partial P_x)$.
Do these results check with those obtained in Prob. 3?

5. Comment on the validity of the statement: "If the derivative $(\partial x^*/\partial P_x)$ is negative,
then $x$ cannot possibly represent an inferior good."

6. When studying the effect of $dP_x$ alone, the first equation in (12.37) reduces to
$-P_x\,dx^* - P_y\,dy^* = x^*\,dP_x$, and when we compensate for the consumer's effective
income loss by dropping the term $x^*\,dP_x$, the equation becomes $-P_x\,dx^* - P_y\,dy^* = 0$.
Show that this last result can be obtained alternatively from a compensation procedure
whereby we try to keep the consumer's optimal utility level $U^*$ (rather than effective
income) unchanged, so that the term $T_2$ can alternatively be interpreted as
$(\partial x^*/\partial P_x)_{U^* = \text{constant}}$. [Hint: Make use of (12.31'').]

7. (a) Does the assumption of diminishing marginal utility to goods $x$ and $y$ imply strictly
convex indifference curves?

(b) Does the assumption of strict convexity in the indifference curves imply diminishing
marginal utility to goods $x$ and $y$?

12.6 Homogeneous Functions

A function is said to be homogeneous of degree $r$, if multiplication of each of its
independent variables by a constant $j$ will alter the value of the function by the proportion $j^r$, that
is, if

$$
f(jx_1, \ldots, jx_n) = j^r f(x_1, \ldots, x_n)
$$

In general, $j$ can take any value. However, in order for the preceding equation to make
sense, $(jx_1, \ldots, jx_n)$ must not lie outside the domain of the function $f$. For this reason, in
economic applications the constant $j$ is usually taken to be positive, as most economic
variables do not admit negative values.


Example 1

Given the function $f(x, y, w) = x/y + 2w/2x$, if we multiply each variable by $j$, we get

$$
f(jx, jy, jw) = \frac{(jx)}{(jy)} + \frac{2(jw)}{2(jx)} = \frac{x}{y} + \frac{2w}{2x} = f(x, y, w) = j^0 f(x, y, w)
$$

In this particular example, the value of the function will not be affected at all by equal
proportionate changes in all the independent variables; or, one might say, the value of the
function is changed by a multiple of $j^0$ ($= 1$). This makes the function $f$ a homogeneous
function of degree zero.

You will observe that the functions $x^*$ and $y^*$ cited at the end of Sec. 12.5 are both
homogeneous of degree zero.


Example 2

When we multiply each variable in the function

$$
g(x, y, w) = \frac{x^2}{y} + \frac{2w^2}{x}
$$

by $j$, we get

$$
g(jx, jy, jw) = \frac{(jx)^2}{(jy)} + \frac{2(jw)^2}{(jx)} = j\left(\frac{x^2}{y} + \frac{2w^2}{x}\right) = j\,g(x, y, w)
$$

The function $g$ is homogeneous of degree one (or, of the first degree); multiplication of
each variable by $j$ will alter the value of the function exactly $j$-fold as well.

Example 3

Now, consider the function $h(x, y, w) = 2x^2 + 3yw - w^2$. A similar multiplication this time
will give us

$$
h(jx, jy, jw) = 2(jx)^2 + 3(jy)(jw) - (jw)^2 = j^2 h(x, y, w)
$$

Thus the function $h$ is homogeneous of degree two; in this case, a doubling of all variables,
for example, will quadruple the value of the function.
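
The three examples can be spot-checked numerically. The sketch below uses a hypothetical helper, `degree_of_homogeneity`, that recovers $r$ from the ratio $f(jx)/f(x) = j^r$ at a single test point; the test point and the value $j = 2$ are arbitrary assumptions, and a check at one point is evidence rather than proof. It requires the function to be positive at the chosen point.

```python
import math

def degree_of_homogeneity(func, point, j=2.0):
    """Estimate r in func(j*x) = j**r * func(x) at one point (func must be positive there)."""
    base = func(*point)
    scaled = func(*(j * v for v in point))
    return math.log(scaled / base) / math.log(j)

def f(x, y, w):          # Example 1
    return x / y + 2 * w / (2 * x)

def g(x, y, w):          # Example 2
    return x**2 / y + 2 * w**2 / x

def h(x, y, w):          # Example 3
    return 2 * x**2 + 3 * y * w - w**2

point = (1.0, 2.0, 3.0)
degrees = [round(degree_of_homogeneity(fn, point), 6) for fn in (f, g, h)]
print(degrees)
```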

Linear Homogeneity 

In the discussion of production functions, wide use is made of homogeneous functions of
the first degree. These are often referred to as linearly homogeneous functions, the adverb
linearly modifying the adjective homogeneous. Some writers, however, seem to prefer
the somewhat misleading terminology linear homogeneous functions, or even linear and
homogeneous functions, which tends to convey, wrongly, the impression that the functions
themselves are linear. On the basis of the function $g$ in Example 2, we know that a function
which is homogeneous of the first degree is not necessarily linear in itself. Hence you
should avoid using the terms “linear homogeneous functions” and “linear and homogeneous
functions” unless, of course, the functions in question are indeed linear. Note, however,
that it is not incorrect to speak of “linear homogeneity,” meaning homogeneity of
degree one, because to modify a noun (homogeneity) does call for the use of an adjective
(linear).

Since the primary field of application of linearly homogeneous functions is in the theory
of production, let us adopt as the framework of our discussion a production function in the
form, say,

$$
Q = f(K, L)
\tag{12.45}
$$

Whether applied at the micro or the macro level, the mathematical assumption of linear
homogeneity would amount to the economic assumption of constant returns to scale, because
linear homogeneity means that raising all inputs (independent variables) $j$-fold will always
raise the output (value of the function) exactly $j$-fold also.

What unique properties characterize this linearly homogeneous production function?

Property I Given the linearly homogeneous production function $Q = f(K, L)$, the average
physical product of labor ($APP_L$) and of capital ($APP_K$) can be expressed as functions
of the capital-labor ratio, $k \equiv K/L$, alone.


To prove this, we multiply each independent variable in (12.45) by a factor $j = 1/L$. By
virtue of linear homogeneity, this will change the output from $Q$ to $jQ = Q/L$. The right
side of (12.45) will correspondingly become

$$
f\left(\frac{K}{L}, \frac{L}{L}\right) = f\left(\frac{K}{L}, 1\right) = f(k, 1)
$$

Since the variables $K$ and $L$ in the original function are to be replaced (whenever they
appear) by $k$ and 1, respectively, the right side in effect becomes a function of the
capital-labor ratio $k$ alone, say, $\phi(k)$, which is a function with a single argument, $k$, even
though two independent variables $K$ and $L$ are actually involved in that argument. Equating
the two sides, we have

$$
APP_L \equiv \frac{Q}{L} = \phi(k)
\tag{12.46}
$$

The expression for $APP_K$ is then found to be

$$
APP_K \equiv \frac{Q}{K} = \frac{Q}{L}\,\frac{L}{K} = \frac{\phi(k)}{k}
\tag{12.47}
$$


Since both average products depend on $k$ alone, linear homogeneity implies that, as long
as the $K/L$ ratio is kept constant (whatever the absolute levels of $K$ and $L$), the average
products will be constant, too. Therefore, while the production function is homogeneous of
degree one, both $APP_L$ and $APP_K$ are homogeneous of degree zero in the variables $K$ and
$L$, since equal proportionate changes in $K$ and $L$ (maintaining a constant $k$) will not alter the
magnitudes of the average products.

Property II Given a linearly homogeneous production function $Q = f(K, L)$, the marginal
physical products $MPP_K$ and $MPP_L$ can be expressed as functions of $k$ alone.

To find the marginal products, we first write the total product as

$$
Q = L\phi(k) \qquad [\text{by (12.46)}]
\tag{12.45'}
$$

and then differentiate $Q$ with respect to $K$ and $L$. For this purpose, we shall find the following
two preliminary results to be of service:

$$
\frac{\partial k}{\partial K} = \frac{\partial}{\partial K}\left(\frac{K}{L}\right) = \frac{1}{L}
\qquad
\frac{\partial k}{\partial L} = \frac{\partial}{\partial L}\left(\frac{K}{L}\right) = \frac{-K}{L^2}
\tag{12.48}
$$

The results of differentiation are

$$
\begin{aligned}
MPP_K \equiv \frac{\partial Q}{\partial K} &= \frac{\partial}{\partial K}[L\phi(k)] \\
&= L\,\frac{\partial \phi(k)}{\partial K} = L\,\frac{d\phi(k)}{dk}\,\frac{\partial k}{\partial K} && [\text{chain rule}] \\
&= L\phi'(k)\left(\frac{1}{L}\right) = \phi'(k) && [\text{by (12.48)}]
\end{aligned}
\tag{12.49}
$$

$$
\begin{aligned}
MPP_L \equiv \frac{\partial Q}{\partial L} &= \frac{\partial}{\partial L}[L\phi(k)] \\
&= \phi(k) + L\,\frac{\partial \phi(k)}{\partial L} && [\text{product rule}] \\
&= \phi(k) + L\phi'(k)\,\frac{\partial k}{\partial L} && [\text{chain rule}] \\
&= \phi(k) + L\phi'(k)\,\frac{-K}{L^2} && [\text{by (12.48)}] \\
&= \phi(k) - k\phi'(k)
\end{aligned}
\tag{12.50}
$$

which indeed show that $MPP_K$ and $MPP_L$ are functions of $k$ alone.

Like average products, the marginal products will remain the same as long as the
capital-labor ratio is held constant; they are homogeneous of degree zero in the variables $K$
and $L$.


Property III (Euler's theorem) If $Q = f(K, L)$ is linearly homogeneous, then

$$
K\frac{\partial Q}{\partial K} + L\frac{\partial Q}{\partial L} \equiv Q
$$

PROOF

$$
\begin{aligned}
K\frac{\partial Q}{\partial K} + L\frac{\partial Q}{\partial L} &= K\phi'(k) + L[\phi(k) - k\phi'(k)] && [\text{by (12.49), (12.50)}] \\
&= K\phi'(k) + L\phi(k) - K\phi'(k) && [k \equiv K/L] \\
&= L\phi(k) = Q && [\text{by (12.45')}]
\end{aligned}
$$

Note that this result is valid for any values of $K$ and $L$; this is why the property can be
written as an identical equality. What this property says is that the value of a linearly
homogeneous function can always be expressed as a sum of terms, each of which is the
product of one of the independent variables and the first-order partial derivative with
respect to that variable, regardless of the levels of the two inputs actually employed. Be
careful, however, to distinguish between the identity
$K\dfrac{\partial Q}{\partial K} + L\dfrac{\partial Q}{\partial L} \equiv Q$ [Euler's theorem,
which applies only to the constant-returns-to-scale case of $Q = f(K, L)$] and the equation
$dQ = \dfrac{\partial Q}{\partial K}\,dK + \dfrac{\partial Q}{\partial L}\,dL$ [total differential of $Q$, for any function $Q = f(K, L)$].

Economically, this property means that under conditions of constant returns to scale, if
each input factor is paid the amount of its marginal product, the total product will be
exactly exhausted by the distributive shares for all the input factors, or, equivalently, the
pure economic profit will be zero. Since this situation is descriptive of the long-run
equilibrium under pure competition, it was once thought that only linearly homogeneous
production functions would make sense in economics. This, of course, is not the case. The zero
economic profit in the long-run equilibrium is brought about by the forces of competition
through the entry and exit of firms, regardless of the specific nature of the production
functions actually prevailing. Thus it is not mandatory to have a production function that
ensures product exhaustion for any and all $(K, L)$ pairs. Moreover, when imperfect
competition exists in the factor markets, the remuneration to the factors may not be equal to the
marginal products, and, consequently, Euler's theorem becomes irrelevant to the
distribution picture. However, linearly homogeneous production functions are often convenient to
work with because of the various nice mathematical properties they are known to possess.
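
Properties II and III can be illustrated numerically. The sketch below assumes a particular linearly homogeneous function, $Q = 2KL/(K + L)$, which is not taken from the text, and estimates the marginal products with a hypothetical central-difference helper, `partial`. It checks that $K\,MPP_K + L\,MPP_L$ reproduces $Q$, and that $MPP_K$ is unchanged when both inputs are tripled (so that $k$ stays the same).

```python
def Q(K, L):
    """Assumed linearly homogeneous example (not from the text): Q = 2KL / (K + L)."""
    return 2 * K * L / (K + L)

def partial(func, args, i, h=1e-6):
    """Central finite-difference estimate of the i-th first-order partial derivative."""
    up = list(args)
    down = list(args)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

K, L = 8.0, 2.0
mpp_K = partial(Q, (K, L), 0)
mpp_L = partial(Q, (K, L), 1)
euler_sum = K * mpp_K + L * mpp_L            # Euler's theorem: should reproduce Q(K, L)

# Tripling both inputs leaves k = K/L, and hence the marginal product, unchanged.
mpp_K_scaled = partial(Q, (3 * K, 3 * L), 0)
print(euler_sum, Q(K, L), mpp_K, mpp_K_scaled)
```

Finite differences are used only to keep the sketch independent of any hand-derived formula; the agreement is therefore approximate, up to rounding error.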


Cobb-Douglas Production Function 

One specific production function widely used in economic analysis (earlier cited in
Sec. 11.6, Example 5) is the Cobb-Douglas production function:

$$
Q = AK^{\alpha}L^{1-\alpha}
\tag{12.51}
$$

where $A$ is a positive constant, and $\alpha$ is a positive fraction. What we shall consider here first
is a generalized version of this function, namely,

$$
Q = AK^{\alpha}L^{\beta}
\tag{12.52}
$$

where $\beta$ is another positive fraction which may or may not be equal to $1 - \alpha$. Some of the
major features of this function are: (1) it is homogeneous of degree $(\alpha + \beta)$; (2) in the
special case of $\alpha + \beta = 1$, it is linearly homogeneous; (3) its isoquants are negatively sloped
throughout and strictly convex for positive values of $K$ and $L$; and (4) it is strictly
quasiconcave for positive $K$ and $L$.

Its homogeneity is easily seen from the fact that, by changing $K$ and $L$ to $jK$ and $jL$,
respectively, the output will be changed to

$$
A(jK)^{\alpha}(jL)^{\beta} = j^{\alpha+\beta}(AK^{\alpha}L^{\beta}) = j^{\alpha+\beta}Q
$$

That is, the function is homogeneous of degree $(\alpha + \beta)$. In case $\alpha + \beta = 1$, there will be
constant returns to scale, because the function will be linearly homogeneous. (Note,
however, that this function is not linear! It would thus be confusing to refer to it as a “linear
homogeneous” or “linear and homogeneous” function.) That its isoquants have negative
slopes and strict convexity can be verified from the signs of the derivatives $dK/dL$ and
$d^2K/dL^2$ (or the signs of $dL/dK$ and $d^2L/dK^2$). For any positive output $Q_0$, (12.52)
can be written as

$$
AK^{\alpha}L^{\beta} = Q_0 \qquad (A, K, L, Q_0 > 0)
$$

Taking the natural log of both sides and transposing, we find that

$$
\ln A + \alpha \ln K + \beta \ln L - \ln Q_0 = 0
$$

which implicitly defines $K$ as a function of $L$.† By the implicit-function rule and the log
rule, therefore, we have

$$
\frac{dK}{dL} = -\frac{\partial F/\partial L}{\partial F/\partial K} = -\frac{(\beta/L)}{(\alpha/K)} = -\frac{\beta K}{\alpha L} < 0
$$

Then it follows that

$$
\frac{d^2K}{dL^2} = \frac{d}{dL}\left(-\frac{\beta K}{\alpha L}\right) = -\frac{\beta}{\alpha}\,\frac{d}{dL}\left(\frac{K}{L}\right) = -\frac{\beta}{\alpha}\,\frac{1}{L^2}\left(L\frac{dK}{dL} - K\right) > 0
$$

The signs of these derivatives establish the isoquant (any isoquant) to be downward-sloping
throughout and strictly convex in the $LK$ plane for positive values of $K$ and $L$. This, of
course, is only to be expected from a function that is strictly quasiconcave for positive $K$
and $L$. For the strict quasiconcavity feature of this function, see Example 5 of Sec. 12.4,
where a similar function was discussed.
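
The slope and curvature results can be checked against a direct numerical computation along one isoquant. In the sketch below the parameter values ($A = 1$, $\alpha = 0.3$, $\beta = 0.5$, $Q_0 = 10$) and the helper `K_on_isoquant` are assumptions made only for illustration; the isoquant is solved explicitly for $K$ and differentiated by finite differences.

```python
A, alpha, beta, Q0 = 1.0, 0.3, 0.5, 10.0     # assumed illustrative values

def K_on_isoquant(L):
    """Solve A * K**alpha * L**beta = Q0 for K at a given L."""
    return (Q0 / (A * L**beta)) ** (1 / alpha)

L, h = 4.0, 1e-4
K = K_on_isoquant(L)

# First derivative: finite difference versus the implicit-function formula.
slope_numeric = (K_on_isoquant(L + h) - K_on_isoquant(L - h)) / (2 * h)
slope_formula = -beta * K / (alpha * L)

# Second derivative by a central second difference; a positive value indicates convexity.
curvature = (K_on_isoquant(L + h) - 2 * K + K_on_isoquant(L - h)) / h**2
print(slope_numeric, slope_formula, curvature)
```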

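Let us now examine the $\alpha + \beta = 1$ case (the Cobb-Douglas function proper), to verify
the three properties of linear homogeneity cited earlier. First of all, the total product in this
special case is expressible as

$$
Q = AK^{\alpha}L^{1-\alpha} = A\left(\frac{K}{L}\right)^{\alpha}L = LAk^{\alpha}
\tag{12.51'}
$$

where the expression $Ak^{\alpha}$ is a specific version of the general expression $\phi(k)$ used before.
Therefore, the average products are

$$
\begin{aligned}
APP_L &= \frac{Q}{L} = Ak^{\alpha} \\
APP_K &= \frac{Q}{K} = \frac{Q}{L}\,\frac{L}{K} = \frac{Ak^{\alpha}}{k} = Ak^{\alpha-1}
\end{aligned}
\tag{12.53}
$$

both of which are now functions of $k$ alone.

Second, differentiation of $Q = AK^{\alpha}L^{1-\alpha}$ yields the marginal products:

$$
\begin{aligned}
\frac{\partial Q}{\partial K} &= A\alpha K^{\alpha-1}L^{-(\alpha-1)} = A\alpha\left(\frac{K}{L}\right)^{\alpha-1} = A\alpha k^{\alpha-1} \\
\frac{\partial Q}{\partial L} &= AK^{\alpha}(1-\alpha)L^{-\alpha} = A(1-\alpha)\left(\frac{K}{L}\right)^{\alpha} = A(1-\alpha)k^{\alpha}
\end{aligned}
\tag{12.54}
$$

and these are also functions of $k$ alone.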


† The conditions of the implicit-function theorem are satisfied, because $F$ (the left-side expression) has
continuous partial derivatives, and because $\partial F/\partial K = \alpha/K \neq 0$ for positive values of $K$.

Last, we can verify Euler's theorem by using (12.54) as follows:

$$
\begin{aligned}
K\frac{\partial Q}{\partial K} + L\frac{\partial Q}{\partial L} &= KA\alpha k^{\alpha-1} + LA(1-\alpha)k^{\alpha} \\
&= LAk^{\alpha}\left(\frac{K\alpha}{Lk} + 1 - \alpha\right) \\
&= LAk^{\alpha}(\alpha + 1 - \alpha) = LAk^{\alpha} = Q && [\text{by (12.51')}]
\end{aligned}
$$


Interesting economic meanings can be assigned to the exponents $\alpha$ and $(1 - \alpha)$ in the
linearly homogeneous Cobb-Douglas production function. If each input is assumed to be
paid by the amount of its marginal product, the relative share of total product accruing to
capital will be

$$
\frac{K(\partial Q/\partial K)}{Q} = \frac{KA\alpha k^{\alpha-1}}{LAk^{\alpha}} = \alpha
$$

Similarly, labor's relative share will be

$$
\frac{L(\partial Q/\partial L)}{Q} = \frac{LA(1-\alpha)k^{\alpha}}{LAk^{\alpha}} = 1 - \alpha
$$

Thus the exponent of each input variable indicates the relative share of that input in the
total product. Looking at it another way, we can also interpret the exponent of each input
variable as the partial elasticity of output with respect to that input. This is because the
capital-share expression just given is equivalent to the expression
$\dfrac{\partial Q/\partial K}{Q/K} \equiv \varepsilon_{QK}$ and,
similarly, the labor-share expression just given is precisely that of $\varepsilon_{QL}$.

What about the meaning of the constant $A$? For given values of $K$ and $L$, the magnitude
of $A$ will proportionately affect the level of $Q$. Hence $A$ may be considered as an efficiency
parameter, i.e., as an indicator of the state of technology.
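
A short numerical sketch of the factor shares follows. The values of $A$, $\alpha$, $K$, and $L$ are assumptions chosen only for illustration; the marginal products are taken from (12.54) and the total product from (12.51').

```python
A, alpha = 2.0, 0.25              # assumed illustrative values
K, L = 16.0, 81.0
k = K / L                         # capital-labor ratio

Q = L * A * k**alpha                          # total product, (12.51')
mpp_K = A * alpha * k**(alpha - 1)            # marginal product of capital, (12.54)
mpp_L = A * (1 - alpha) * k**alpha            # marginal product of labor, (12.54)

capital_share = K * mpp_K / Q                 # should equal alpha
labor_share = L * mpp_L / Q                   # should equal 1 - alpha
print(capital_share, labor_share, capital_share + labor_share)
```

That the two shares add up to one is the product-exhaustion result of Euler's theorem for this function.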


Extensions of the Results 

We have discussed linear homogeneity in the specific context of production functions, but
the properties cited are equally valid in other contexts, provided the variables $K$, $L$, and $Q$
are properly reinterpreted.

Furthermore, it is possible to extend our results to the case of more than two variables.
With a linearly homogeneous function

$$
y = f(x_1, x_2, \ldots, x_n)
$$

we can again divide each variable by $x_1$ (that is, multiply by $1/x_1$) and get the result

$$
y = x_1\,\phi\left(\frac{x_2}{x_1}, \frac{x_3}{x_1}, \ldots, \frac{x_n}{x_1}\right) \qquad [\text{homogeneity of degree 1}]
$$

which is comparable to (12.45'). Moreover, Euler's theorem is easily extended to the form

$$
\sum_{i=1}^{n} x_i f_i \equiv y \qquad [\text{Euler's theorem}]
$$

where the partial derivatives of the original function $f$ (namely, $f_i$) are again homogeneous
of degree zero in the variables $x_i$, as in the two-variable case.

The preceding extensions can, in fact, also be generalized with relative ease to a
homogeneous function of degree $r$. In the first place, by definition of homogeneity, we can in the
present case write

$$
y = x_1^{\,r}\,\phi\left(\frac{x_2}{x_1}, \frac{x_3}{x_1}, \ldots, \frac{x_n}{x_1}\right) \qquad [\text{homogeneity of degree } r]
$$

The modified version of Euler's theorem will now appear in the form

$$
\sum_{i=1}^{n} x_i f_i \equiv r y \qquad [\text{Euler's theorem}]
$$

where a multiplicative constant $r$ has been attached to the dependent variable $y$ on the right.
And, finally, the partial derivatives of the original function $f$, the $f_i$, will all be
homogeneous of degree $(r - 1)$ in the variables $x_i$. You can thus see that the linear-homogeneity
case is merely a special case thereof, in which $r = 1$.
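
The degree-$r$ version of Euler's theorem can be illustrated with the function $h$ of Example 3, which is homogeneous of degree two. In the sketch below the partial derivatives are written out by hand in a hypothetical helper `gradient_h`, and the test point and the scale factor $j = 5$ are arbitrary assumptions.

```python
def h(x, y, w):
    """Example 3 of this section: homogeneous of degree two."""
    return 2 * x**2 + 3 * y * w - w**2

def gradient_h(x, y, w):
    """Analytic first-order partials (h_x, h_y, h_w)."""
    return (4 * x, 3 * w, 3 * y - 2 * w)

point = (1, 2, 3)
r = 2

# Euler's theorem for degree r: the sum of x_i * f_i should equal r * y.
euler_sum = sum(v * d for v, d in zip(point, gradient_h(*point)))

# Each partial derivative should itself be homogeneous of degree r - 1 = 1.
j = 5
scaled_gradient = gradient_h(*(j * v for v in point))
print(euler_sum, r * h(*point), scaled_gradient)
```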


EXERCISE 12.6 


1. Determine whether the following functions are homogeneous. If so, of what degree?

(a) $f(x, y) = \sqrt{xy}$

(b) $f(x, y) = (x^2 - y^2)^{1/2}$

(c) $f(x, y) = x^3 - xy + y^3$

(d) $f(x, y) = 2x + y + 3\sqrt{xy}$

(e) $f(x, y, w) = \dfrac{xy^2}{w} + 2xw$

(f) $f(x, y, w) = x^4 - 5yw^3$

2. Show that the function (12.45) can be expressed alternatively as $Q = K\psi\left(\dfrac{L}{K}\right)$ instead
of $Q = L\phi\left(\dfrac{K}{L}\right)$.

3. Deduce from Euler's theorem that, with constant returns to scale:

(a) When $MPP_K = 0$, $APP_L$ is equal to $MPP_L$.

(b) When $MPP_L = 0$, $APP_K$ is equal to $MPP_K$.

4. On the basis of (12.46) through (12.50), check whether the following are true under
conditions of constant returns to scale:

(a) An $APP_L$ curve can be plotted against $k$ ($= K/L$) as the independent variable (on
the horizontal axis).

(b) $MPP_K$ is measured by the slope of that $APP_L$ curve.

(c) $APP_K$ is measured by the slope of the radius vector to the $APP_L$ curve.

(d) $MPP_L = APP_L - k(MPP_K) = APP_L - k$ (slope of $APP_L$).

5. Use (12.53) and (12.54) to verify that the relations described in Prob. 4b, c, and d are
obeyed by the Cobb-Douglas production function.

6. Given the production function $Q = AK^{\alpha}L^{\beta}$, show that:

(a) $\alpha + \beta > 1$ implies increasing returns to scale.

(b) $\alpha + \beta < 1$ implies decreasing returns to scale.

(c) $\alpha$ and $\beta$ are, respectively, the partial elasticities of output with respect to the capital
and labor inputs.

7. Let output be a function of three inputs: $Q = AK^{a}L^{b}N^{c}$.

(a) Is this function homogeneous? If so, of what degree?

(b) Under what condition would there be constant returns to scale? Increasing returns
to scale?

(c) Find the share of product for input $N$, if it is paid by the amount of its marginal
product.

8. Let the production function $Q = g(K, L)$ be homogeneous of degree 2.

(a) Write an equation to express the second-degree homogeneity property of this
function.

(b) Find an expression for $Q$ in terms of $\phi(k)$, in the vein of (12.45').

(c) Find the $MPP_K$ function. Is $MPP_K$ still a function of $k$ alone, as in the
linear-homogeneity case?

(d) Is the $MPP_K$ function homogeneous in $K$ and $L$? If so, of what degree?


12.7 Least-Cost Combination of Inputs

As another example of constrained optimization, let us discuss the problem of finding the
least-cost input combination for the production of a specified level of output $Q_0$
representing, say, a customer's special order. Here we shall work with a general production function;
later on, however, reference will be made to homogeneous production functions.


First-Order Condition 

Assuming a smooth production function with two variable inputs, $Q = Q(a, b)$, where
$Q_a, Q_b > 0$, and assuming both input prices to be exogenous (though again omitting the
zero subscript), we may formulate the problem as one of minimizing the cost

$$
C = aP_a + bP_b
$$

subject to the output constraint

$$
Q(a, b) = Q_0
$$

Hence, the Lagrangian function is

$$
Z = aP_a + bP_b + \mu[Q_0 - Q(a, b)]
$$

To satisfy the first-order condition for a minimum $C$, the input levels (the choice
variables) must satisfy the following simultaneous equations:

$$
\begin{aligned}
Z_{\mu} &= Q_0 - Q(a, b) = 0 \\
Z_a &= P_a - \mu Q_a = 0 \\
Z_b &= P_b - \mu Q_b = 0
\end{aligned}
$$

The first equation in this set is merely the constraint restated, and the last two imply the
condition

$$
\frac{P_a}{Q_a} = \frac{P_b}{Q_b} = \mu
\tag{12.55}
$$

At the point of optimal input combination, the input-price-marginal-product ratio must be
the same for each input. Since this ratio measures the amount of outlay per unit of marginal
product of the input in question, the Lagrange multiplier can be given the interpretation of
the marginal cost of production in the optimum state. This interpretation is, of course,
entirely consistent with our earlier discovery in (12.16) that the optimal value of the Lagrange
multiplier measures the comparative-static effect of the constraint constant on the optimal
value of the objective function, that is, $\mu^* = (§C^*/§Q_0)$, where the § symbol indicates that
this is a partial total derivative.
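
To make the first-order condition concrete, the sketch below works through an assumed example that is not from the text: a production function $Q = a^{0.5}b^{0.5}$ with input prices $P_a = 1$ and $P_b = 4$. The hypothetical helper `least_cost` solves the equal-ratio condition together with the output constraint; the two price-to-marginal-product ratios are then compared with a finite-difference estimate of marginal cost.

```python
alpha, beta = 0.5, 0.5            # assumed exponents in Q = a**alpha * b**beta
Pa, Pb = 1.0, 4.0                 # assumed input prices

def least_cost(Q0):
    """Least-cost inputs from P_a/Q_a = P_b/Q_b together with the output constraint."""
    ratio = beta * Pa / (alpha * Pb)                  # b/a implied by the first-order condition
    a = (Q0 / ratio**beta) ** (1 / (alpha + beta))    # from a**alpha * (ratio*a)**beta = Q0
    b = ratio * a
    return a, b, a * Pa + b * Pb

a, b, cost = least_cost(10.0)
Qa = alpha * a**(alpha - 1) * b**beta                 # marginal product of input a
Qb = beta * a**alpha * b**(beta - 1)                  # marginal product of input b
mu_a, mu_b = Pa / Qa, Pb / Qb                         # both ratios should equal mu

# Marginal cost by finite difference: the change in minimum cost per unit of Q0.
dQ = 1e-6
marginal_cost = (least_cost(10.0 + dQ)[2] - least_cost(10.0 - dQ)[2]) / (2 * dQ)
print(a, b, cost, mu_a, mu_b, marginal_cost)
```

In this example the common ratio and the finite-difference marginal cost agree, in line with the interpretation of the multiplier as marginal cost.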

Equation (12.55) can be alternatively written in the form

$$
\frac{P_a}{P_b} = \frac{Q_a}{Q_b}
\tag{12.55'}
$$

which you should compare with (12.31''). Presented in this form, the first-order condition
can be explained in terms of isoquants and isocosts. As we learned in (11.36), the $Q_a/Q_b$
ratio is the negative of the slope of an isoquant; that is, it is a measure of the marginal rate
of technical substitution of $a$ for $b$ ($MRTS_{ab}$). In the present model, the output level is
specified at $Q_0$; thus only one isoquant is involved, as shown in Fig. 12.8, with a negative slope.

The $P_a/P_b$ ratio, on the other hand, represents the negative of the slope of isocosts (a
notion comparable with the budget line in consumer theory). An isocost, defined as the locus
of the input combinations that entail the same total cost, is expressible by the equation

$$
C_0 = aP_a + bP_b \qquad \text{or} \qquad b = \frac{C_0}{P_b} - \frac{P_a}{P_b}\,a
$$

where $C_0$ stands for a (parametric) cost figure. When plotted in the $ab$ plane, as in Fig. 12.8,
therefore, it yields a family of straight lines with (negative) slope $-P_a/P_b$ (and vertical
intercept $C_0/P_b$). The equality of the two ratios therefore amounts to the equality of the
slopes of the isoquant and a selected isocost. Since we are compelled to stay on the given
isoquant, this condition leads us to the point of tangency $E$ and the input combination
$(a^*, b^*)$.

Second-Order Condition

To ensure a minimum cost, it is sufficient (after the first-order condition is met) to have a 
negative bordered Hessian, i.c,, to have 



0 Q a Q h 
Qa "“t^Qaa ~i^Quh 
Qb ~t x Qba ~P-Qhb 


= n(Qo,iQi-2QabQaQh + QhbQl) <0 


Since the optimal value of p. (marginal cost) is positive, this reduces to the condition that 
the expression in parentheses be negative when evaluated at E. 

From (11.44), we recall that the curvature of an isoquant is represented by the second 
derivative 


~T~2 — q! (Q<*aQh 2 QabQaQh + QbbQa) 

in which the same parenthetical expression appears. Inasmuch as $Q_b$ is positive, the 
satisfaction of the second-order sufficient condition would imply that $d^2b/da^2$ is positive (that 
is, the isoquant is strictly convex) at the point of tangency. In the present context, the strict 
convexity of the isoquant would also imply the satisfaction of the second-order sufficient 
condition. For, since the isoquant is negatively sloped, strict convexity can mean only a 
positive $d^2b/da^2$ (zero $d^2b/da^2$ is possible only at a stationary point on the isoquant), which 
would in turn ensure that $|\bar{H}| < 0$. However, it should again be borne in mind that the 
sufficient condition $|\bar{H}| < 0$ (and hence the strict convexity of the isoquant) at the tangency is, 
per se, not necessary for the minimization of $C$. Specifically, $C$ can be minimized even 
when the isoquant is (nonstrictly) convex, in a multiple-minimum situation analogous to 
Fig. 12.7b, with $d^2b/da^2 = 0$ and $|\bar{H}| = 0$ at each minimum. 
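The link between the bordered Hessian and the curvature of the isoquant can be checked numerically. The sketch below is only an illustration: it assumes a Cobb-Douglas function $Q = Aa^{\alpha}b^{\beta}$ with made-up parameter values and a made-up positive $\mu$, and the helper names `det3` and `second_order_check` are hypothetical. It evaluates the determinant directly, compares it with $\mu$ times the parenthetical expression, and reports $d^2b/da^2$.

```python
# Illustrative check only: all parameter values below are assumptions.

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows (cofactor expansion)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def second_order_check(A, alpha, beta, a, b, mu):
    """Return (|H|, mu * parenthetical expression, d2b/da2) for Q = A a^alpha b^beta."""
    Q = A * a**alpha * b**beta
    Qa, Qb = alpha * Q / a, beta * Q / b          # marginal products
    Qaa = alpha * (alpha - 1) * Q / a**2          # second partials
    Qbb = beta * (beta - 1) * Q / b**2
    Qab = alpha * beta * Q / (a * b)
    H = [[0.0, Qa, Qb],
         [Qa, -mu * Qaa, -mu * Qab],
         [Qb, -mu * Qab, -mu * Qbb]]
    paren = Qaa * Qb**2 - 2 * Qab * Qa * Qb + Qbb * Qa**2
    curvature = -paren / Qb**3                    # d2b/da2 along the isoquant
    return det3(H), mu * paren, curvature


det_H, mu_times_paren, curvature = second_order_check(
    A=1.0, alpha=0.5, beta=0.5, a=4.0, b=9.0, mu=2.0)
print(det_H, mu_times_paren, curvature)
```

For these assumed values the determinant is negative and the curvature is positive, in line with the argument in the text.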

In discussing the utility-maximization model (Sec. 12.5), it was pointed out that a 
smooth, increasing, strictly quasiconcave utility function $U = U(x, y)$ gives rise to 
everywhere strictly convex, downward-sloping indifference curves in the $xy$ plane. Since the 
notion of isoquants is almost identical with that of indifference curves,* we can reason by 
analogy that a smooth, increasing, strictly quasiconcave production function $Q = Q(a, b)$ 
can generate everywhere strictly convex, downward-sloping isoquants in the $ab$ plane. If 
such a production function is assumed, then obviously the second-order sufficient 
condition will always be satisfied. Moreover, it should be clear that the resulting $C^*$ will be a 
unique absolute constrained minimum. 


The Expansion Path 

Let us now turn to one of the comparative-static aspects of this model. Assuming a fixed 
ratio of the two input prices, let us postulate successive increases of $Q_0$ (ascent to higher 
and higher isoquants) and trace the effect on the least-cost combination $b^*/a^*$. Each shift 
of the isoquant, of course, will result in a new point of tangency, with a higher isocost. The 
locus of such points of tangency, known as the expansion path of the firm, serves to 
describe the least-cost combinations required to produce varying levels of $Q_0$. Two possible 
shapes of the expansion path are shown in Fig. 12.9. 


* Both are in the nature of "isovalue" curves. They differ only in the field of application: indifference 
curves are used in models of consumption, and isoquants in models of production. 

FIGURE 12.9 

If we assume the strict convexity of the isoquants (hence, satisfaction of the second-order 
condition), the expansion path will be derivable directly from the first-order condition 
(12.55'). Let us illustrate this for the generalized version of the Cobb-Douglas production 
function. 

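The condition (12.55') requires the equality of the input-price ratio and the 
marginal-product ratio. For the function $Q = Aa^{\alpha}b^{\beta}$, this means that each point on the expansion 
path must satisfy 

$$
\frac{P_a}{P_b} = \frac{Q_a}{Q_b} = \frac{A\alpha a^{\alpha-1} b^{\beta}}{A a^{\alpha} \beta b^{\beta-1}} = \frac{\alpha b}{\beta a} \tag{12.56}
$$

implying that the optimal input ratio should be 

$$
\frac{b^*}{a^*} = \frac{\beta P_a}{\alpha P_b} = \text{a constant} \tag{12.57}
$$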


since $\alpha$, $\beta$, and the input prices are all constant. As a result, all points on the expansion path 
must show the same fixed input ratio; i.e., the expansion path must be a straight line 
emanating from the point of origin. This is illustrated in Fig. 12.9b, where the input ratios at the 
various points of tangency ($AE/OA$, $A'E'/OA'$, and $A''E''/OA''$) are all equal. 

The linearity of the expansion path is characteristic of the generalized Cobb-Douglas 
function whether or not $\alpha + \beta = 1$, because the derivation of the result in (12.57) does not 
rely on the assumption $\alpha + \beta = 1$. As a matter of fact, any homogeneous production 
function (not necessarily the Cobb-Douglas) will give rise to a linear expansion path for each 
set of input prices, for the following reason: if it is homogeneous of (say) degree $r$, 
both marginal-product functions $Q_a$ and $Q_b$ must be homogeneous of degree $(r - 1)$ in the 
inputs $a$ and $b$; thus a $j$-fold increase in both inputs will produce a $j^{r-1}$-fold change in 
the values of both $Q_a$ and $Q_b$, which will leave the $Q_a/Q_b$ ratio intact. Therefore, if the 
first-order condition $P_a/P_b = Q_a/Q_b$ is satisfied at given input prices by a particular input 
combination $(a_0, b_0)$, it must also be satisfied by a combination $(ja_0, jb_0)$, precisely as is 
depicted by the linear expansion path in Fig. 12.9b. 
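The invariance of $Q_a/Q_b$ along a ray from the origin can be illustrated numerically. The following is a minimal sketch, assuming a Cobb-Douglas function with arbitrarily chosen parameters ($\alpha + \beta \neq 1$ on purpose) and using central finite differences in place of analytical derivatives; the function names are hypothetical.

```python
def marginal_products(Q, a, b, h=1e-6):
    """Central-difference approximations to Q_a and Q_b at the point (a, b)."""
    Qa = (Q(a + h, b) - Q(a - h, b)) / (2 * h)
    Qb = (Q(a, b + h) - Q(a, b - h)) / (2 * h)
    return Qa, Qb


def cobb_douglas(a, b, A=2.0, alpha=0.3, beta=0.9):
    """Assumed example: homogeneous of degree alpha + beta = 1.2."""
    return A * a**alpha * b**beta


def mp_ratio_along_ray(Q, a0, b0, scales):
    """Q_a / Q_b evaluated at (j*a0, j*b0) for each scale factor j."""
    ratios = []
    for j in scales:
        Qa, Qb = marginal_products(Q, j * a0, j * b0)
        ratios.append(Qa / Qb)
    return ratios


ratios = mp_ratio_along_ray(cobb_douglas, 2.0, 3.0, [1, 2, 5, 10])
print(ratios)
```

Every entry should agree (up to finite-difference error) with $\alpha b_0/(\beta a_0)$ from (12.56), so a price ratio that supports $(a_0, b_0)$ also supports every scaled combination.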

Although any homogeneous production function can give rise to a linear expansion path, 
the specific degree of homogeneity does make a significant difference in the interpretation 
of the expansion path. In Fig. 12.9b, we have drawn the distance $OE$ equal to that of $EE'$, so 
that point $E'$ involves a doubling of the scale of point $E$. Now if the production function is 
homogeneous of degree one, the output at $E'$ must be twice ($2^1 = 2$) that of $E$. But if the 
degree of homogeneity is two, the output at $E'$ will be four times ($2^2 = 4$) that of $E$. Thus, 
the spacing of the isoquants for $Q = 1$, $Q = 2$, ..., will be widely different for different 
degrees of homogeneity. 


Homothetic Functions 

We have explained that, given a set of input prices, homogeneity (of any degree) of the 
production function produces a linear expansion path. But linear expansion paths are not 
unique to homogeneous production functions, for a more general class of functions, known 
as homothetic functions, can produce them, too. 

Homotheticity can arise from a composite function in the form 

$$
H = h[Q(a, b)] \qquad [h'(Q) \neq 0] \tag{12.58}
$$

where $Q(a, b)$ is homogeneous of degree $r$. Although derived from a homogeneous 
function, the function $H = H(a, b)$ is in general not homogeneous in the variables $a$ and $b$. 
Nonetheless, the expansion paths of $H(a, b)$, like those of $Q(a, b)$, are linear. The key to 
this result is that, at any given point in the $ab$ plane, the $H$ isoquant shares the same slope 
as the $Q$ isoquant: 

$$
\text{Slope of } H \text{ isoquant} = -\frac{H_a}{H_b} = -\frac{h'(Q) Q_a}{h'(Q) Q_b} = -\frac{Q_a}{Q_b} = \text{slope of } Q \text{ isoquant} \tag{12.59}
$$

Now the linearity of the expansion paths of $Q(a, b)$ implies, and is implied by, the 
condition 

$$
\frac{Q_a}{Q_b} = \text{constant for any given } \frac{b}{a}
$$

In view of (12.59), however, we immediately have 

$$
\frac{H_a}{H_b} = \text{constant for any given } \frac{b}{a} \tag{12.60}
$$

as well. And this establishes that $H(a, b)$ also produces linear expansion paths. 

The concept of homotheticity is more general than that of homogeneity. In fact, every 
homogeneous function is automatically a member of the homothetic family, but a 
homothetic function may be a function outside the homogeneous family. The fact that a 
homogeneous function is always homothetic can be seen from (12.58), where if we let the 
function $H = h(Q)$ take the specific form $H = Q$, with $h'(Q) = dH/dQ = 1$, then 
the function $Q$, being identical with the function $H$ itself, is obviously homothetic. That 
a homothetic function may not be homogeneous will be illustrated in Example 2, which 
follows. 

In defining the homothetic function $H$, we specified in (12.58) that $h'(Q) \neq 0$. This 
enables us to avoid division by zero in (12.59). While the specification $h'(Q) \neq 0$ is the only 
requirement from the mathematical standpoint, economic considerations would suggest the 
stronger restriction $h'(Q) > 0$. For if $H(a, b)$ is, like $Q(a, b)$, to serve as a production 
function, that is, if $H$ is to denote output, then $H_a$ and $H_b$ should, respectively, be made to 
go in the same direction as $Q_a$ and $Q_b$ in the $Q(a, b)$ function. Thus $H(a, b)$ needs to be 
restricted to be a monotonically increasing transformation of $Q(a, b)$. 

Homothetic production functions (including the special case of homogeneous ones) 
possess the interesting property that the (partial) elasticity of optimal input level with 
respect to the output level is uniform for all inputs. To see this, recall that the linearity of 
expansion paths of homothetic functions means that the optimal input ratio $b^*/a^*$ is 
unaffected by a change in the exogenous output level $H_0$. Thus $\partial(b^*/a^*)/\partial H_0 = 0$ or 

$$
\frac{1}{a^{*2}} \left( a^* \frac{\partial b^*}{\partial H_0} - b^* \frac{\partial a^*}{\partial H_0} \right) = 0 \qquad [\text{quotient rule}]
$$

Multiplying through by $a^{*2}H_0/a^*b^*$, and rearranging, we then get 

$$
\frac{\partial a^*}{\partial H_0} \frac{H_0}{a^*} = \frac{\partial b^*}{\partial H_0} \frac{H_0}{b^*}
$$

or 

$$
\text{elasticity of } a^* \text{ with respect to } H_0 = \text{elasticity of } b^* \text{ with respect to } H_0
$$

which is what we previously asserted. 

Example 1 

Let $H = Q^2$, where $Q = Aa^{\alpha}b^{\beta}$. Since $Q(a, b)$ is homogeneous and $h'(Q) = 2Q$ is positive 
for positive output, $H(a, b)$ is homothetic for $Q > 0$. We shall verify that it satisfies (12.60). 
First, by substitution, we have 

$$
H = Q^2 = (Aa^{\alpha}b^{\beta})^2 = A^2 a^{2\alpha} b^{2\beta}
$$

Thus the slope of the isoquants of $H$ is expressed by 

$$
-\frac{H_a}{H_b} = -\frac{A^2 2\alpha a^{2\alpha-1} b^{2\beta}}{A^2 a^{2\alpha} 2\beta b^{2\beta-1}} = -\frac{\alpha b}{\beta a} \tag{12.61}
$$

This result satisfies (12.60) and implies linear expansion paths. A comparison of (12.61) 
with (12.56) also shows that the function $H$ satisfies (12.59). 

In this example, $Q(a, b)$ is homogeneous of degree $(\alpha + \beta)$. As it turns out, $H(a, b)$ is also 
homogeneous, but of degree $2(\alpha + \beta)$. As a rule, however, a homothetic function is not 
necessarily homogeneous. 

Example 2 

Let $H = e^Q$, where $Q = Aa^{\alpha}b^{\beta}$. Since $Q(a, b)$ is homogeneous and $h'(Q) = e^Q$ is positive, 
$H(a, b)$ is homothetic. From this function, 

$$
H(a, b) = \exp(Aa^{\alpha}b^{\beta})
$$

it is easily found that 

$$
-\frac{H_a}{H_b} = -\frac{A\alpha a^{\alpha-1} b^{\beta} \exp(Aa^{\alpha}b^{\beta})}{A a^{\alpha} \beta b^{\beta-1} \exp(Aa^{\alpha}b^{\beta})} = -\frac{\alpha b}{\beta a}
$$

This result is, of course, identical with (12.61) in Example 1. This time, however, the 
homothetic function is not homogeneous, because 

$$
H(ja, jb) = \exp[A(ja)^{\alpha}(jb)^{\beta}] = \exp(Aa^{\alpha}b^{\beta}j^{\alpha+\beta})
= [\exp(Aa^{\alpha}b^{\beta})]^{j^{\alpha+\beta}} = [H(a, b)]^{j^{\alpha+\beta}} \neq j^r H(a, b)
$$
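The contrast drawn in Example 2 can be seen in numbers. The sketch below assumes specific parameter values and a single test point, and the helper names `marginal_ratio` and `implied_degree` are hypothetical. It compares the marginal-product ratios of $Q$ and $H = e^Q$, and then asks what degree $r$ would be needed for $F(ja, jb) = j^r F(a, b)$ at two different values of $j$. For a homogeneous function the answer should not depend on $j$.

```python
import math

A, ALPHA, BETA = 1.0, 0.4, 0.6   # assumed parameter values


def Q(a, b):
    """Homogeneous Cobb-Douglas function (degree ALPHA + BETA)."""
    return A * a**ALPHA * b**BETA


def H(a, b):
    """Homothetic transformation H = exp(Q)."""
    return math.exp(Q(a, b))


def marginal_ratio(F, a, b, h=1e-6):
    """Central-difference approximation to F_a / F_b at (a, b)."""
    Fa = (F(a + h, b) - F(a - h, b)) / (2 * h)
    Fb = (F(a, b + h) - F(a, b - h)) / (2 * h)
    return Fa / Fb


def implied_degree(F, a, b, j):
    """Degree r that would satisfy F(ja, jb) = j**r * F(a, b) at this point."""
    return math.log(F(j * a, j * b) / F(a, b)) / math.log(j)


point = (2.0, 3.0)
ratio_Q = marginal_ratio(Q, *point)
ratio_H = marginal_ratio(H, *point)
degrees_Q = [implied_degree(Q, *point, j) for j in (2, 3)]
degrees_H = [implied_degree(H, *point, j) for j in (2, 3)]
print(ratio_Q, ratio_H, degrees_Q, degrees_H)
```

The two ratios coincide, as (12.59) requires, while the implied degree is the same for both $j$ values in the case of $Q$ but changes with $j$ in the case of $H$.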
Elasticity of Substitution 

Another aspect of comparative statics has to do with the effect of a change in the $P_a/P_b$ 
ratio upon the least-cost input combination $b^*/a^*$ for producing the same given output 
$Q_0$ (that is, while we stay on the same isoquant). 

When the (exogenous) input-price ratio $P_a/P_b$ rises, we can normally expect the 
optimal input ratio $b^*/a^*$ also to rise, because input $b$ (now relatively cheaper) will tend to be 
substituted for input $a$. The direction of substitution is clear, but what about its extent? The 
extent of input substitution can be measured by the following point-elasticity expression, 
called the elasticity of substitution and denoted by $\sigma$ (lowercase Greek letter sigma, for 
"substitution"): 

$$
\sigma \equiv \frac{\text{relative change in } (b^*/a^*)}{\text{relative change in } (P_a/P_b)}
= \frac{\dfrac{d(b^*/a^*)}{b^*/a^*}}{\dfrac{d(P_a/P_b)}{P_a/P_b}}
= \frac{\dfrac{d(b^*/a^*)}{d(P_a/P_b)}}{\dfrac{b^*/a^*}{P_a/P_b}} \tag{12.62}
$$

The value of $\sigma$ can be anywhere between 0 and $\infty$; the larger the $\sigma$, the greater the 
substitutability between the two inputs. The limiting case of $\sigma = 0$ is where the two inputs must 
be used in a fixed proportion as complements to each other. The other limiting case, with $\sigma$ 
infinite, is where the two inputs are perfect substitutes for each other. Note that, if $(b^*/a^*)$ 
is considered as a function of $(P_a/P_b)$, then the elasticity $\sigma$ will again be the ratio of a 
marginal function to an average function.¹ 

For illustration, let us calculate the elasticity of substitution for the generalized 
Cobb-Douglas production function. We learned earlier that, for this case, the least-cost input 
combination is specified by 

$$
\frac{b^*}{a^*} = \frac{\beta}{\alpha} \left( \frac{P_a}{P_b} \right) \qquad [\text{from (12.57)}]
$$

This equation is in the form $y = ax$, for which $dy/dx$ (the marginal) and $y/x$ (the average) 
are both equal to the constant $a$. That is, 

$$
\frac{d(b^*/a^*)}{d(P_a/P_b)} = \frac{\beta}{\alpha} \qquad \text{and} \qquad \frac{b^*/a^*}{P_a/P_b} = \frac{\beta}{\alpha}
$$

Substituting these values into (12.62), we immediately find that $\sigma = 1$; that is, the 
generalized Cobb-Douglas production function is characterized by a constant, unitary elasticity of 
substitution. Note that the derivation of this result in no way relies upon the assumption that 
$\alpha + \beta = 1$. Thus the elasticity of substitution of the production function $Q = Aa^{\alpha}b^{\beta}$ will 
be unitary even if $\alpha + \beta \neq 1$. 
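As a rough cross-check that does not use (12.57) at all, we can find the least-cost input ratio by brute force and then measure how it responds to the price ratio. The sketch below is an illustration under stated assumptions: the parameter values (with $\alpha + \beta = 0.8$), the output level, and the search bounds are all made up, and the function names are hypothetical. It minimizes cost along the isoquant by ternary search and computes an arc elasticity in logarithms.

```python
import math


def least_cost_ratio(p_a, p_b, q0=10.0, A=1.0, alpha=0.3, beta=0.5):
    """Brute-force b*/a* for Q = A a^alpha b^beta on the isoquant Q = q0."""
    def b_on_isoquant(a):
        return (q0 / (A * a**alpha)) ** (1.0 / beta)

    def cost(a):
        return p_a * a + p_b * b_on_isoquant(a)

    # Ternary search over t = ln(a); cost is convex in t, so it has one minimum.
    lo, hi = math.log(1e-3), math.log(1e3)
    for _ in range(200):
        t1 = lo + (hi - lo) / 3
        t2 = hi - (hi - lo) / 3
        if cost(math.exp(t1)) < cost(math.exp(t2)):
            hi = t2
        else:
            lo = t1
    a_star = math.exp((lo + hi) / 2)
    return b_on_isoquant(a_star) / a_star


def arc_elasticity_of_substitution(p, p_b=1.0, step=0.01):
    """Change in ln(b*/a*) divided by change in ln(P_a/P_b) for a 1% price change."""
    r0 = least_cost_ratio(p, p_b)
    r1 = least_cost_ratio(p * (1 + step), p_b)
    return (math.log(r1) - math.log(r0)) / math.log(1 + step)


sigma = arc_elasticity_of_substitution(2.0)
print(least_cost_ratio(2.0, 1.0), sigma)
```

The search reproduces the ratio $\beta P_a/(\alpha P_b)$ and an elasticity close to one, even though $\alpha + \beta \neq 1$ in this assumed case.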


¹ There is an alternative way of expressing $\sigma$. Since, at the point of tangency, we always have 

$$
\frac{P_a}{P_b} = \frac{Q_a}{Q_b} = \text{MRTS}_{ab}
$$

the elasticity of substitution can be defined equivalently as 

$$
\sigma = \frac{\text{relative change in } (b^*/a^*)}{\text{relative change in MRTS}_{ab}}
= \frac{\dfrac{d(b^*/a^*)}{b^*/a^*}}{\dfrac{d(Q_a/Q_b)}{Q_a/Q_b}}
= \frac{\dfrac{d(b^*/a^*)}{d(Q_a/Q_b)}}{\dfrac{b^*/a^*}{Q_a/Q_b}} \tag{12.62'}
$$

CES Production Function 

More recently, there has come into common use another form of production function 
which, while still characterized by a constant elasticity of substitution (CES), can yield 
a $\sigma$ with a (constant) value other than 1.† The equation of this function, known as the CES 
production function, is 

$$
Q = A\left[\delta K^{-\rho} + (1 - \delta)L^{-\rho}\right]^{-1/\rho} \qquad (A > 0;\ 0 < \delta < 1;\ -1 < \rho \neq 0) \tag{12.63}
$$

where $K$ and $L$ represent two factors of production, and $A$, $\delta$, and $\rho$ (lowercase Greek letter 
rho) are three parameters. The parameter $A$ (the efficiency parameter) plays the same role 
as the coefficient $A$ in the Cobb-Douglas function; it serves as an indicator of the state of 
technology. The parameter $\delta$ (the distribution parameter), like the $\alpha$ in the Cobb-Douglas 
function, has to do with the relative factor shares in the product. And the parameter $\rho$ (the 
substitution parameter), which has no counterpart in the Cobb-Douglas function, is 
what determines the value of the (constant) elasticity of substitution, as will be shown later 
in this section. 

First, however, let us observe that this function is homogeneous of degree one. If we 
replace $K$ and $L$ by $jK$ and $jL$, respectively, the output will change from $Q$ to 

$$
A\left[\delta (jK)^{-\rho} + (1 - \delta)(jL)^{-\rho}\right]^{-1/\rho}
= A\left\{ j^{-\rho}\left[\delta K^{-\rho} + (1 - \delta)L^{-\rho}\right] \right\}^{-1/\rho}
= (j^{-\rho})^{-1/\rho} Q = jQ
$$

Consequently, the CES function, like all linearly homogeneous production functions, 
displays constant returns to scale, qualifies for the application of Euler's theorem, and 
possesses average products and marginal products that are homogeneous of degree zero in the 
variables $K$ and $L$. 

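† K. J. Arrow, H. B. Chenery, B. S. Minhas, and R. M. Solow, "Capital-Labor Substitution and Economic 
Efficiency," Review of Economics and Statistics, August 1961, pp. 225-250. 

We may also note that the isoquants generated by the CES production function are 
always negatively sloped and strictly convex for positive values of $K$ and $L$. To show this, 
let us first find the expressions for the marginal products $Q_L$ and $Q_K$. Using the notation 
$[\cdots]$ as a shorthand for $[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]$, we have 

$$
\begin{aligned}
Q_L \equiv \frac{\partial Q}{\partial L} &= A\left(-\frac{1}{\rho}\right)[\cdots]^{-(1/\rho)-1}(1 - \delta)(-\rho)L^{-\rho-1} \\
&= (1 - \delta)A[\cdots]^{-(1+\rho)/\rho}L^{-(1+\rho)} \\
&= (1 - \delta)\frac{A^{1+\rho}}{A^{\rho}}[\cdots]^{-(1+\rho)/\rho}L^{-(1+\rho)} \\
&= \frac{(1 - \delta)}{A^{\rho}}\left(\frac{Q}{L}\right)^{1+\rho} > 0 \qquad [\text{by (12.63)}]
\end{aligned} \tag{12.64}
$$

and similarly, 

$$
Q_K \equiv \frac{\partial Q}{\partial K} = \frac{\delta}{A^{\rho}}\left(\frac{Q}{K}\right)^{1+\rho} > 0 \tag{12.65}
$$

which are defined for positive values of $K$ and $L$. Thus the slope of isoquants (with $K$ 
plotted vertically and $L$ horizontally) is 

$$
\frac{dK}{dL} = -\frac{Q_L}{Q_K} = -\frac{(1 - \delta)}{\delta}\left(\frac{K}{L}\right)^{1+\rho} < 0 \qquad [\text{see (11.36)}] \tag{12.66}
$$

It can then be easily checked that $d^2K/dL^2 > 0$ (which we leave to you as an exercise), 
implying that the isoquants are strictly convex for positive $K$ and $L$. 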
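The closed-form marginal products in (12.64) and (12.65) lend themselves to a quick numerical sanity check. The sketch below assumes particular parameter values and a particular input point (none of which come from the text), and the function names are hypothetical. It compares the formulas with finite-difference derivatives, verifies Euler's theorem and linear homogeneity, and confirms the slope formula (12.66).

```python
def ces(K, L, A=1.5, delta=0.4, rho=0.5):
    """CES production function (12.63) with assumed parameter values."""
    return A * (delta * K**(-rho) + (1 - delta) * L**(-rho)) ** (-1.0 / rho)


def marginal_products(K, L, A=1.5, delta=0.4, rho=0.5):
    """Marginal products from the closed forms (12.65) and (12.64)."""
    Q = ces(K, L, A, delta, rho)
    Q_L = (1 - delta) / A**rho * (Q / L) ** (1 + rho)   # (12.64)
    Q_K = delta / A**rho * (Q / K) ** (1 + rho)         # (12.65)
    return Q_K, Q_L


K0, L0, h = 3.0, 5.0, 1e-6
Q0 = ces(K0, L0)
Q_K, Q_L = marginal_products(K0, L0)

# Finite-difference derivatives for comparison.
Q_K_numeric = (ces(K0 + h, L0) - ces(K0 - h, L0)) / (2 * h)
Q_L_numeric = (ces(K0, L0 + h) - ces(K0, L0 - h)) / (2 * h)

euler_sum = K0 * Q_K + L0 * Q_L                          # should equal Q0
slope = -Q_L / Q_K                                       # dK/dL on the isoquant
slope_formula = -(1 - 0.4) / 0.4 * (K0 / L0) ** (1 + 0.5)  # (12.66)
print(Q_K, Q_K_numeric, Q_L, Q_L_numeric, euler_sum, Q0, slope)
```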

It can also be shown that the CES production function is strictly quasiconcave for 
positive $K$ and $L$. Further differentiation of (12.64) and (12.65) shows that the second 
derivatives of the function have the following signs: 

$$
Q_{LL} = \frac{\partial}{\partial L} Q_L = \frac{(1 - \delta)(1 + \rho)}{A^{\rho}}\left(\frac{Q}{L}\right)^{\rho}\frac{Q_L L - Q}{L^2} < 0 \qquad [Q_L L - Q < 0, \text{ by Euler's theorem}]
$$

$$
Q_{KK} = \frac{\partial}{\partial K} Q_K = \frac{\delta(1 + \rho)}{A^{\rho}}\left(\frac{Q}{K}\right)^{\rho}\frac{Q_K K - Q}{K^2} < 0 \qquad [Q_K K - Q < 0, \text{ by Euler's theorem}]
$$

$$
Q_{KL} = Q_{LK} = \frac{(1 - \delta)(1 + \rho)}{A^{\rho}}\left(\frac{Q}{L}\right)^{\rho}\frac{Q_K}{L} > 0
$$

These derivative signs, valid for positive $K$ and $L$, enable us to check the sufficient 
condition for strict quasiconcavity (12.26). As you can verify, 

$$
|B_1| = -Q_K^2 < 0 \qquad \text{and} \qquad |B_2| = 2Q_K Q_L Q_{KL} - Q_K^2 Q_{LL} - Q_L^2 Q_{KK} > 0
$$

Thus the CES function is strictly quasiconcave for positive $K$ and $L$. 

Last, we shall use the marginal products in (12.64) and (12.65) to find the elasticity of 
substitution of the CES function. To satisfy the least-cost combination condition 
$Q_L/Q_K = P_L/P_K$, where $P_L$ and $P_K$ denote the prices of labor service (wage rate) and 
capital service (rental charge for capital goods), respectively, we must have 

$$
\frac{1 - \delta}{\delta}\left(\frac{K}{L}\right)^{1+\rho} = \frac{P_L}{P_K} \qquad [\text{see (12.66)}]
$$

Thus the optimal input ratio is (introducing a shorthand symbol $c$) 

$$
\frac{K^*}{L^*} = \left(\frac{\delta}{1 - \delta}\right)^{1/(1+\rho)}\left(\frac{P_L}{P_K}\right)^{1/(1+\rho)} \equiv c\left(\frac{P_L}{P_K}\right)^{1/(1+\rho)} \tag{12.67}
$$

Taking $(K^*/L^*)$ to be a function of $(P_L/P_K)$, we find the associated marginal and average 
functions to be 

$$
\text{Marginal function} = \frac{d(K^*/L^*)}{d(P_L/P_K)} = \frac{c}{1 + \rho}\left(\frac{P_L}{P_K}\right)^{1/(1+\rho) - 1}
$$

$$
\text{Average function} = \frac{K^*/L^*}{P_L/P_K} = c\left(\frac{P_L}{P_K}\right)^{1/(1+\rho) - 1}
$$

Therefore the elasticity of substitution is† 

$$
\sigma = \frac{\text{Marginal function}}{\text{Average function}} = \frac{1}{1 + \rho} \tag{12.68}
$$


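What this shows is that $\sigma$ is a constant whose magnitude depends on the value of the 
parameter $\rho$ as follows: 

$$
\left. \begin{array}{c} -1 < \rho < 0 \\ \rho = 0 \\ 0 < \rho < \infty \end{array} \right\}
\Longrightarrow
\left\{ \begin{array}{c} \sigma > 1 \\ \sigma = 1 \\ \sigma < 1 \end{array} \right.
$$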
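The marginal-over-average calculation can be mimicked numerically. The following sketch takes the optimal-ratio formula (12.67) as given, approximates the marginal function by a central difference, and divides by the average function. The values chosen for $\delta$, the price ratio, and $\rho$ are assumptions for illustration, and the function names are hypothetical.

```python
def optimal_ratio(p, delta, rho):
    """K*/L* from (12.67), with p standing for the price ratio P_L / P_K."""
    c = (delta / (1 - delta)) ** (1 / (1 + rho))
    return c * p ** (1 / (1 + rho))


def elasticity_of_substitution(p, delta, rho, h=1e-6):
    """Ratio of the (numerical) marginal function to the average function."""
    marginal = (optimal_ratio(p + h, delta, rho)
                - optimal_ratio(p - h, delta, rho)) / (2 * h)
    average = optimal_ratio(p, delta, rho) / p
    return marginal / average


# One value of rho from each side of zero, plus a larger positive one.
sigmas = {rho: elasticity_of_substitution(1.7, 0.4, rho) for rho in (-0.5, 1.0, 3.0)}
print(sigmas)
```

Each computed value should match $1/(1 + \rho)$, exceeding one for the negative $\rho$ and falling below one for the positive ones.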


Cobb-Douglas Function as a Special Case of the CES Function 

In this last result, the middle case of $\rho = 0$ leads to a unitary elasticity of substitution which, 
as we know, is characteristic of the Cobb-Douglas function. This suggests that the (linearly 
homogeneous) Cobb-Douglas function is a special case of the (linearly homogeneous) CES 
function. The difficulty is that the CES function, as given in (12.63), is undefined when 
$\rho = 0$, because division by zero is not possible. Nevertheless, we can demonstrate that, as 
$\rho \to 0$, the CES function approaches the Cobb-Douglas function. 

For this demonstration, we shall rely on a technique known as L'Hôpital's rule. This rule 
has to do with the evaluation of the limit of a function $f(x) = m(x)/n(x)$ as $x \to a$ (where $a$ 
can be either finite or infinite), when the numerator $m(x)$ and the denominator $n(x)$ either 
(1) both tend to zero as $x \to a$, thus resulting in an expression of the $0/0$ form, or (2) both 
tend to $\pm\infty$ as $x \to a$, thus resulting in an expression in the form of $\infty/\infty$ (or $\infty/{-\infty}$, or 
$-\infty/\infty$, or $-\infty/{-\infty}$). Even though the limit of $f(x)$ cannot be evaluated as the 
expression stands under these two circumstances, its value can nevertheless be found by using the 
formula 

$$
\lim_{x \to a} \frac{m(x)}{n(x)} = \lim_{x \to a} \frac{m'(x)}{n'(x)} \qquad [\text{L'Hôpital's rule}] \tag{12.69}
$$


Example 3 

Find the limit of $(1 - x^2)/(1 - x)$ as $x \to 1$. Here, both $m(x)$ and $n(x)$ approach zero as $x$ 
approaches unity, thus exemplifying circumstance (1). Since $m'(x) = -2x$ and $n'(x) = -1$, 
we can write 

$$
\lim_{x \to 1} \frac{1 - x^2}{1 - x} = \lim_{x \to 1} \frac{-2x}{-1} = \lim_{x \to 1} 2x = 2
$$

This answer is identical with that obtained by another method in Example 2 of Sec. 6.4. 


† Of course, we could also have obtained the same result by first taking the logarithms of both sides 
of (12.67): 

$$
\ln\left(\frac{K^*}{L^*}\right) = \ln c + \frac{1}{1 + \rho}\ln\left(\frac{P_L}{P_K}\right)
$$

and then applying the formula for elasticity in (10.28), to get 

$$
\sigma = \frac{d(\ln K^*/L^*)}{d(\ln P_L/P_K)} = \frac{1}{1 + \rho}
$$

Example 4 

Find the limit of $(2x + 5)/(x + 1)$ as $x \to \infty$. When $x$ becomes infinite, both $m(x)$ and $n(x)$ 
become infinite in the present case; thus we have here an example of circumstance (2). 
Since $m'(x) = 2$ and $n'(x) = 1$, we can write 

$$
\lim_{x \to \infty} \frac{2x + 5}{x + 1} = \lim_{x \to \infty} \frac{2}{1} = 2
$$

Again, this answer is identical with that obtained by another method in Example 3 of 
Sec. 6.4. 
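A numerical evaluation cannot prove a limit, but it offers a quick plausibility check on Examples 3 and 4. The sketch below simply evaluates the two quotients at points approaching the limit; the step sizes and the function names `example_3` and `example_4` are assumptions made for illustration.

```python
def example_3(x):
    """The quotient (1 - x^2) / (1 - x), undefined at x = 1 itself."""
    return (1 - x**2) / (1 - x)


def example_4(x):
    """The quotient (2x + 5) / (x + 1)."""
    return (2 * x + 5) / (x + 1)


# Approach x = 1 from the right, and let x grow large.
near_one = [example_3(1 + h) for h in (1e-2, 1e-4, 1e-6)]
far_out = [example_4(x) for x in (1e2, 1e4, 1e8)]
print(near_one, far_out)
```

Both sequences settle toward 2, the value that L'Hôpital's rule delivers from the derivative ratios $-2x/(-1)$ and $2/1$.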


It may turn out that the right-side expression in (12.69) again falls into the $0/0$ or 
the $\infty/\infty$ format, same as the left-side expression. In such an event, we may reapply 
L'Hôpital's rule, i.e., we may look for the limit of $m''(x)/n''(x)$ as $x \to a$, and take that 
limit as our answer. It may also turn out that even though the given function $f(x)$, whose 
limit we wish to evaluate, is originally not in the form of $m(x)/n(x)$ that falls into the $0/0$ 
or the $\infty/\infty$ format upon limit-taking, a suitable transformation will make $f(x)$ amenable 
to the application of the rule in (12.69). This latter possibility can be illustrated by the 
problem of finding the limit of the CES function (12.63), now viewed as a function 
$Q(\rho)$, as $\rho \to 0$. 

As given, $Q(\rho)$ is not in the form of $m(\rho)/n(\rho)$. Dividing both sides of (12.63) by $A$, 
and taking the natural log, however, we do get an expression in that form, namely, 

$$
\ln\frac{Q}{A} = \frac{-\ln\left[\delta K^{-\rho} + (1 - \delta)L^{-\rho}\right]}{\rho} \equiv \frac{m(\rho)}{n(\rho)} \tag{12.70}
$$

Moreover, as $\rho \to 0$, we find that $m(\rho) \to -\ln(\delta + 1 - \delta) = -\ln 1 = 0$, and $n(\rho) \to 0$, 
too. Thus L'Hôpital's rule can be used to find the limit of $\ln(Q/A)$. Once that is done, the 
limit of $Q$ can also be found: since $Q/A = e^{\ln(Q/A)}$, so that $Q = Ae^{\ln(Q/A)}$, it follows that 

$$
\lim Q = \lim Ae^{\ln(Q/A)} = Ae^{\lim \ln(Q/A)} \tag{12.71}
$$


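From (12.70), let us first find $m'(\rho)$ and $n'(\rho)$, as required by L'Hôpital's rule. The latter 
is simply $n'(\rho) = 1$. The former is 

$$
\begin{aligned}
m'(\rho) &= \frac{-1}{\left[\delta K^{-\rho} + (1 - \delta)L^{-\rho}\right]}\frac{d}{d\rho}\left[\delta K^{-\rho} + (1 - \delta)L^{-\rho}\right] && [\text{chain rule}] \\
&= \frac{-\left[-\delta K^{-\rho}\ln K - (1 - \delta)L^{-\rho}\ln L\right]}{\left[\delta K^{-\rho} + (1 - \delta)L^{-\rho}\right]} && [\text{by (10.21')}]
\end{aligned}
$$

By L'Hôpital's rule, therefore, we have 

$$
\lim_{\rho \to 0} \ln\frac{Q}{A} = \lim_{\rho \to 0} \frac{m'(\rho)}{n'(\rho)} = \frac{\delta \ln K + (1 - \delta)\ln L}{1} = \ln\left(K^{\delta}L^{1-\delta}\right)
$$

In view of this result, when $e$ is raised to the power of $\lim_{\rho \to 0} \ln(Q/A)$, the outcome is simply 
$K^{\delta}L^{1-\delta}$. Hence, by (12.71), we finally arrive at the result 

$$
\lim_{\rho \to 0} Q = AK^{\delta}L^{1-\delta}
$$

showing that, as $\rho \to 0$, the CES function indeed tends to the Cobb-Douglas function. 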
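The convergence can also be watched numerically. The sketch below evaluates the CES function at a fixed input pair for a sequence of $\rho$ values shrinking toward zero and records the distance from the Cobb-Douglas value $AK^{\delta}L^{1-\delta}$. The parameter values, the input pair, and the sequence of $\rho$ values are assumptions made for illustration only.

```python
def ces(K, L, rho, A=2.0, delta=0.3):
    """CES function (12.63); rho must be nonzero."""
    return A * (delta * K**(-rho) + (1 - delta) * L**(-rho)) ** (-1.0 / rho)


def cobb_douglas(K, L, A=2.0, delta=0.3):
    """The limiting form A * K^delta * L^(1 - delta)."""
    return A * K**delta * L**(1 - delta)


K0, L0 = 4.0, 9.0
target = cobb_douglas(K0, L0)

# Absolute gap between the CES value and the Cobb-Douglas value as rho shrinks.
gaps = [abs(ces(K0, L0, rho) - target) for rho in (0.1, 0.01, 0.001, 0.0001)]
gap_from_below = abs(ces(K0, L0, -0.0001) - target)
print(target, gaps, gap_from_below)
```

The gaps shrink as $\rho$ approaches zero from either side, in line with the limit just derived.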
EXERCISE 12.7 


1. Suppose that the isoquants in Fig. 12.9b are derived from a particular homogeneous 
production function $Q = Q(a, b)$. Noting that $OE = EE' = E'E''$, what must be the 
ratios between the output levels represented by the three isoquants if the function $Q$ is 
homogeneous 

(a) of degree one? (b) of degree two? 

2. For the generalized Cobb-Douglas case, if we plot the ratio $b^*/a^*$ against the ratio 
$P_a/P_b$, what type of curve will result? Does this result depend on the assumption that 
$\alpha + \beta = 1$? Read the elasticity of substitution graphically from this curve. 

3. Is the CES production function characterized by diminishing returns to each input for 
all positive levels of input? 

4. Show that, on an isoquant of the CES function, $d^2K/dL^2 > 0$. 

5. (a) For the CES function, if each factor of production is paid according to its marginal 
product, what is the ratio of labor's share of product to capital's share of product? 
Would a larger value of $\delta$ mean a larger relative share for capital? 

(b) For the Cobb-Douglas function, is the ratio of labor's share to capital's share 
dependent on the $K/L$ ratio? Does the same answer apply to the CES function? 

6. (a) The CES production function rules out $\rho = -1$. If $\rho = -1$, however, what would be 
the general shape of the isoquants for positive $K$ and $L$? 

(b) Is $\sigma$ defined for $\rho = -1$? What is the limit of $\sigma$ as $\rho \to -1$? 

(c) Interpret economically the results for parts (a) and (b). 

7. Show that by writing the CES function as $Q = A[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]^{-r/\rho}$, where $r > 0$ 
is a new parameter, we can introduce increasing returns to scale and decreasing returns 
to scale. 


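8. Evaluate the following: 

(a) $\lim_{x \to 4} \dfrac{x^2 - x - 12}{x - 4}$ 

(b) $\lim_{x \to 0} \dfrac{e^x - 1}{x}$ 

(c) $\lim_{x \to 0} \dfrac{5^x - e^x}{x}$ 

(d) $\lim_{x \to \infty} \dfrac{\ln x}{x}$ 

9. By use of L'Hôpital's rule, show that 

(a) $\lim_{x \to \infty} \dfrac{x^n}{e^x} = 0$ 

(b) $\lim_{x \to 0^+} x \ln x = 0$ 

(c) $\lim_{x \to 0^+} x^x = 1$ 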



Further Topics in 
Optimization 


This chapter deals with two major topics. The first is nonlinear programming, which 
extends the techniques of constrained optimization of Chap. 12 by allowing inequality con¬ 
straints into the problem. In Chap. 12, the constraints must be satisfied as strict equalities; 
i.e., the constraints are always binding. Now we shall consider constraints that may not be 
binding in the solution; i.e., they may be satisfied as inequalities in the solution. 

In the second part of this chapter, we revert to the realm of classical constrained 
optimization to discuss some topics left untouched in the previous chapters. These include 
the indirect objective function, the envelope theorem, and the concept of duality. 

13.1 Nonlinear Programming and Kuhn-Tucker Conditions 

In the history of methodological development, the first attempts at dealing with inequality 
constraints were concentrated on linear ones only. With linearity prevailing in the 
constraints as well as in the objective function, the resulting methodology is quite naturally 
christened linear programming. Despite the limitation of linearity, however, we could, for 
the first time, explicitly specify the choice variables to be nonnegative, as is appropriate in 
most economic analysis. This represents a significant advance. Nonlinear programming, a 
later development, makes it possible even to handle nonlinear inequality constraints and 
a nonlinear objective function. Thus it occupies a most important place in optimization 
methodology. 

In the classical optimization problem, with no explicit restrictions on the signs of the 
choice variables, and with no inequalities in the constraints, the first-order condition for 
a relative or local extremum is simply that the first partial derivatives of the (smooth) 
Lagrangian function with respect to all the choice variables and the Lagrange multipliers 
be zero. In nonlinear programming, there exists a similar type of first-order condition, 
known as the Kuhn-Tucker conditions.† As we shall see, however, while the classical 
first-order condition is always necessary, the Kuhn-Tucker conditions cannot be accorded the 
status of necessary conditions unless a certain proviso is satisfied. On the other hand, under 
certain specific circumstances, the Kuhn-Tucker conditions turn out to be sufficient 
conditions, or even necessary-and-sufficient conditions as well. 

† H. W. Kuhn and A. W. Tucker, "Nonlinear Programming," in J. Neyman (ed.), Proceedings of the 
Second Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 
Berkeley, California, 1951, pp. 481-492. 

FIGURE 13.1 

Since the Kuhn-Tucker conditions are the single most important analytical result in 
nonlinear programming, it is essential to have a proper understanding of those conditions as 
well as their implications. For the sake of expository convenience, we shall develop these 
conditions in two steps. 


Step 1: Effect of Nonnegativity Restrictions 

As the first step, consider a problem with nonnegativity restrictions on the choice variables, 
but with no other constraints. Taking the single-variable case, in particular, we have 

$$
\begin{aligned}
&\text{Maximize} && \pi = f(x_1) \\
&\text{subject to} && x_1 \geq 0
\end{aligned} \tag{13.1}
$$

where the function $f$ is assumed to be differentiable. In view of the restriction $x_1 \geq 0$, three 
possible situations may arise. First, if a local maximum of $\pi$ occurs in the interior of the 
shaded feasible region in Fig. 13.1, such as point $A$ in Fig. 13.1a, then we have an interior 
solution. The first-order condition in this case is $d\pi/dx_1 = f'(x_1) = 0$, same as in the 
classical problem. Second, as illustrated by point $B$ in Fig. 13.1b, a local maximum can also 
occur on the vertical axis, where $x_1 = 0$. Even in this second case, where we have a 
boundary solution, the first-order condition $f'(x_1) = 0$ nevertheless remains valid. However, as a 
third possibility, a local maximum may in the present context take the position of point $C$ 
or point $D$ in Fig. 13.1c, because to qualify as a local maximum in problem (13.1), the 
candidate point merely has to be higher than the neighboring points within the feasible region. 
In view of this last possibility, the maximum point in a problem like (13.1) can be 
characterized, not only by the equation $f'(x_1) = 0$, but also by the inequality $f'(x_1) < 0$. Note, on 
the other hand, that the opposite inequality $f'(x_1) > 0$ can safely be ruled out, for at a point 
where the curve is upward-sloping, we can never have a maximum, even if that point is 
located on the vertical axis, such as point $E$ in Fig. 13.1a. 

The upshot of the preceding discussion is that, in order for a value of $x_1$ to give a local 
maximum of $\pi$ in problem (13.1), it must satisfy one of the following three conditions: 

$$
f'(x_1) = 0 \quad \text{and} \quad x_1 > 0 \qquad [\text{point } A] \tag{13.2}
$$

$$
f'(x_1) = 0 \quad \text{and} \quad x_1 = 0 \qquad [\text{point } B] \tag{13.3}
$$

$$
f'(x_1) < 0 \quad \text{and} \quad x_1 = 0 \qquad [\text{points } C \text{ and } D] \tag{13.4}
$$

Actually, these three conditions can be consolidated into a single statement 

$$
f'(x_1) \leq 0 \qquad x_1 \geq 0 \qquad \text{and} \qquad x_1 f'(x_1) = 0 \tag{13.5}
$$

The first inequality in (13.5) is a summary of the information regarding $f'(x_1)$ 
enumerated in (13.2) through (13.4). The second inequality is a similar summary for $x_1$; in fact, 
it merely reiterates the nonnegativity restriction of the problem. And, as for the third part 
of (13.5), we have an equation which expresses an important feature common to (13.2) 
through (13.4), namely, that of the two quantities $x_1$ and $f'(x_1)$, at least one must take a zero 
value, so that the product of the two must be zero. This feature is referred to as the 
complementary slackness between $x_1$ and $f'(x_1)$. Taken together, the three parts of (13.5) constitute 
the first-order necessary condition for a local maximum in a problem where the choice 
variable must be nonnegative. But going a step further, we can also take them to be necessary for 
a global maximum. This is because a global maximum must also be a local maximum and, 
as such, must also satisfy the necessary condition for a local maximum. 
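The three-part condition (13.5) is easy to test mechanically at a candidate point. The sketch below is an illustration only: the three objective functions are assumed examples chosen to mimic the situations of points $A$, $B$, and $C$/$D$, and the name `satisfies_13_5` is hypothetical. A small tolerance is used because floating-point values are rarely exactly zero.

```python
def satisfies_13_5(f_prime, x, tol=1e-9):
    """True if f'(x) <= 0, x >= 0 and x * f'(x) = 0 hold (up to a small tolerance)."""
    slope = f_prime(x)
    return slope <= tol and x >= -tol and abs(x * slope) <= tol


# Derivatives of three assumed objective functions.
def interior(x):
    return -2 * (x - 2)        # f(x) = -(x - 2)^2, peak at x = 2 (like point A)


def flat_boundary(x):
    return -2 * x              # f(x) = -x^2, peak exactly at x = 0 (like point B)


def falling(x):
    return -2 * (x + 1)        # f(x) = -(x + 1)^2, declining for all x >= 0 (like C, D)


print(satisfies_13_5(interior, 2.0),       # interior solution
      satisfies_13_5(flat_boundary, 0.0),  # boundary solution with zero slope
      satisfies_13_5(falling, 0.0),        # boundary solution with negative slope
      satisfies_13_5(interior, 0.0),       # upward-sloping at the axis (like point E)
      satisfies_13_5(falling, 1.0))        # positive x with nonzero slope
```

The first three candidates pass, while the last two fail: one violates $f'(x_1) \leq 0$ and the other violates complementary slackness.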

When the problem contains $n$ choice variables: 

$$
\begin{aligned}
&\text{Maximize} && \pi = f(x_1, x_2, \ldots, x_n) \\
&\text{subject to} && x_j \geq 0 \qquad (j = 1, 2, \ldots, n)
\end{aligned} \tag{13.6}
$$

the classical first-order condition $f_1 = f_2 = \cdots = f_n = 0$ must be similarly modified. To 
do this, we can apply the same type of reasoning underlying (13.5) to each choice variable 
$x_j$ taken by itself. Graphically, this amounts to viewing the horizontal axis in Fig. 13.1 as 
representing each $x_j$ in turn. The required modification of the first-order condition then 
readily suggests itself: 

$$
f_j \leq 0 \qquad x_j \geq 0 \qquad \text{and} \qquad x_j f_j = 0 \qquad (j = 1, 2, \ldots, n) \tag{13.7}
$$

where $f_j$ is the partial derivative $\partial\pi/\partial x_j$. 


Step 2: Effect of Inequality Constraints 

With this background, we now proceed to the second step, and try to include inequality 
constraints as well. For simplicity, let us first deal with a problem with three choice 
variables ($n = 3$) and two constraints ($m = 2$): 

$$
\begin{aligned}
&\text{Maximize} && \pi = f(x_1, x_2, x_3) \\
&\text{subject to} && g^1(x_1, x_2, x_3) \leq r_1 \\
& && g^2(x_1, x_2, x_3) \leq r_2 \\
&\text{and} && x_1, x_2, x_3 \geq 0
\end{aligned} \tag{13.8}
$$

which, with the help of two dummy variables $s_1$ and $s_2$, can be transformed into the 
equivalent form 

$$
\begin{aligned}
&\text{Maximize} && \pi = f(x_1, x_2, x_3) \\
&\text{subject to} && g^1(x_1, x_2, x_3) + s_1 = r_1 \\
& && g^2(x_1, x_2, x_3) + s_2 = r_2 \\
&\text{and} && x_1, x_2, x_3, s_1, s_2 \geq 0
\end{aligned} \tag{13.8'}
$$

If the nonnegativity restrictions are absent, we may, in line with the classical approach, 
form the Lagrangian function: 

$$
Z' = f(x_1, x_2, x_3) + \lambda_1[r_1 - g^1(x_1, x_2, x_3) - s_1] + \lambda_2[r_2 - g^2(x_1, x_2, x_3) - s_2] \tag{13.9}
$$

and write the first-order condition as 

$$
\frac{\partial Z'}{\partial x_1} = \frac{\partial Z'}{\partial x_2} = \frac{\partial Z'}{\partial x_3} = \frac{\partial Z'}{\partial s_1} = \frac{\partial Z'}{\partial s_2} = \frac{\partial Z'}{\partial \lambda_1} = \frac{\partial Z'}{\partial \lambda_2} = 0
$$

But since the $x_j$ and $s_i$ variables do have to be nonnegative, the first-order condition on 
those variables should be modified in accordance with (13.7). Consequently, we obtain the 
following set of conditions instead: 

$$
\begin{aligned}
&\frac{\partial Z'}{\partial x_j} \leq 0 \qquad x_j \geq 0 \qquad \text{and} \qquad x_j \frac{\partial Z'}{\partial x_j} = 0 \\
&\frac{\partial Z'}{\partial s_i} \leq 0 \qquad s_i \geq 0 \qquad \text{and} \qquad s_i \frac{\partial Z'}{\partial s_i} = 0 \\
&\frac{\partial Z'}{\partial \lambda_i} = 0 \qquad (i = 1, 2;\ j = 1, 2, 3)
\end{aligned} \tag{13.10}
$$

Note that the derivatives $\partial Z'/\partial \lambda_i$ are still to be set strictly equal to zero. (Why?) 

Each line of (13.10) relates to a different type of variable. But we can consolidate the 
last two lines and, in the process, eliminate the dummy variable $s_i$ from the first-order 
condition. Inasmuch as $\partial Z'/\partial s_i = -\lambda_i$, the second line of (13.10) tells us that we must have 
$-\lambda_i \leq 0$, $s_i \geq 0$, and $-s_i \lambda_i = 0$, or equivalently, 

$$
s_i \geq 0 \qquad \lambda_i \geq 0 \qquad \text{and} \qquad s_i \lambda_i = 0 \tag{13.11}
$$

But the third line, a restatement of the constraints in (13.8'), means that 
$s_i = r_i - g^i(x_1, x_2, x_3)$. By substituting the latter into (13.11), therefore, we can combine the second 
and third lines of (13.10) into 

$$
r_i - g^i(x_1, x_2, x_3) \geq 0 \qquad \lambda_i \geq 0 \qquad \text{and} \qquad \lambda_i[r_i - g^i(x_1, x_2, x_3)] = 0
$$

This enables us to express the first-order condition (13.10) in an equivalent form without 
the dummy variables. Using the symbol $g^i_j$ to denote $\partial g^i/\partial x_j$, we now write 

$$
\begin{aligned}
&\frac{\partial Z'}{\partial x_j} = f_j - (\lambda_1 g^1_j + \lambda_2 g^2_j) \leq 0 \qquad x_j \geq 0 \qquad \text{and} \qquad x_j \frac{\partial Z'}{\partial x_j} = 0 \\
&r_i - g^i(x_1, x_2, x_3) \geq 0 \qquad \lambda_i \geq 0 \qquad \text{and} \qquad \lambda_i[r_i - g^i(x_1, x_2, x_3)] = 0
\end{aligned} \tag{13.12}
$$

These, then, are the Kuhn-Tucker conditions for problem (13.8), or, more accurately, one 
version of the Kuhn-Tucker conditions, expressed in terms of the Lagrangian function 
in (13.9). 

Now that we know the results, though, it is possible to obtain the same set of conditions 
more directly by using a different Lagrangian function. Given the problem (13.8), let us 
ignore the nonnegativity restrictions as well as the inequality signs in the constraints and 
write the purely classical type of Lagrangian function $Z$: 

$$
Z = f(x_1, x_2, x_3) + \lambda_1[r_1 - g^1(x_1, x_2, x_3)] + \lambda_2[r_2 - g^2(x_1, x_2, x_3)] \tag{13.13}
$$

Then let us do the following: (1) set the partial derivatives $\partial Z/\partial x_j \leq 0$, but $\partial Z/\partial \lambda_i \geq 0$, 
(2) impose nonnegativity restrictions on $x_j$ and $\lambda_i$, and (3) require complementary 
slackness to prevail between each variable and the partial derivative of $Z$ with respect to 
that variable, that is, require their product to vanish. Since the results of these steps, 
namely, 

$$
\begin{aligned}
&\frac{\partial Z}{\partial x_j} = f_j - (\lambda_1 g^1_j + \lambda_2 g^2_j) \leq 0 \qquad x_j \geq 0 \qquad \text{and} \qquad x_j \frac{\partial Z}{\partial x_j} = 0 \\
&\frac{\partial Z}{\partial \lambda_i} = r_i - g^i(x_1, x_2, x_3) \geq 0 \qquad \lambda_i \geq 0 \qquad \text{and} \qquad \lambda_i \frac{\partial Z}{\partial \lambda_i} = 0
\end{aligned} \tag{13.14}
$$

are identical with (13.12), the Kuhn-Tucker conditions are expressible also in terms of the 
Lagrangian function $Z$ (as against $Z'$). Note that, by switching from $Z'$ to $Z$, we can not only 
arrive at the Kuhn-Tucker conditions more directly, but also identify the expression 
$r_i - g^i(x_1, x_2, x_3)$, which was left nameless in (13.12), as the partial derivative $\partial Z/\partial \lambda_i$. 
In the subsequent discussion, therefore, we shall only use the (13.14) version of the 
Kuhn-Tucker conditions, based on the Lagrangian function $Z$. 


Example 1 


If we cast the familiar problem of utility maximization into the nonlinear programming 
mold, we may have a problem with an inequality constraint as follows: 

Maximize $U = U(x, y)$ 

subject to $P_x x + P_y y \le B$ 

and $x, y \ge 0$ 

Note that, with the inequality constraint, the consumer is no longer required to spend the 
entire amount $B$. 

To add a new twist to the problem, however, let us suppose that a ration has been imposed 
on commodity $x$ equal to $X_0$. Then the consumer would face a second constraint, and 
the problem changes to 

Maximize $U = U(x, y)$ 

subject to $P_x x + P_y y \le B$ 

$x \le X_0$ 

and $x, y \ge 0$ 

The Lagrangian function is 

$$Z = U(x, y) + \lambda_1 (B - P_x x - P_y y) + \lambda_2 (X_0 - x)$$

and the Kuhn-Tucker conditions are 

$$
\begin{array}{lll}
Z_x = U_x - P_x \lambda_1 - \lambda_2 \le 0 & x \ge 0 & \text{and} \quad x Z_x = 0 \\
Z_y = U_y - P_y \lambda_1 \le 0 & y \ge 0 & \text{and} \quad y Z_y = 0 \\
Z_{\lambda_1} = B - P_x x - P_y y \ge 0 & \lambda_1 \ge 0 & \text{and} \quad \lambda_1 Z_{\lambda_1} = 0 \\
Z_{\lambda_2} = X_0 - x \ge 0 & \lambda_2 \ge 0 & \text{and} \quad \lambda_2 Z_{\lambda_2} = 0
\end{array}
$$

It is useful to examine the implications of the third column of the Kuhn-Tucker conditions. 
The condition $\lambda_1 Z_{\lambda_1} = 0$, in particular, requires that 

$$\lambda_1 (B - P_x x - P_y y) = 0$$
Therefore, we must have either 

$$\lambda_1 = 0 \qquad \text{or} \qquad B - P_x x - P_y y = 0$$

If we interpret $\lambda_1$ as the marginal utility of budget money (income), and if the budget 
constraint is nonbinding (satisfied as an inequality in the solution, with money left over), the 
marginal utility of $B$ should be zero ($\lambda_1 = 0$). 

Similarly, the condition $\lambda_2 Z_{\lambda_2} = 0$ requires that either 

$$\lambda_2 = 0 \qquad \text{or} \qquad X_0 - x = 0$$

Since $\lambda_2$ can be interpreted as the marginal utility of relaxing the constraint, we see that if 
the ration constraint is nonbinding, the marginal utility of relaxing the constraint should be 
zero ($\lambda_2 = 0$). 

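This feature, referred to as complementary slackness, plays an essential role in the search 
for a solution. We shall now illustrate this with a numerical example: 

Maximize $U = xy$ 

subject to $x + y \le 100$ 

$x \le 40$ 

and $x, y \ge 0$ 

The Lagrangian is 

$$Z = xy + \lambda_1 (100 - x - y) + \lambda_2 (40 - x)$$

and the Kuhn-Tucker conditions become 

$$
\begin{array}{lll}
Z_x = y - \lambda_1 - \lambda_2 \le 0 & x \ge 0 & \text{and} \quad x Z_x = 0 \\
Z_y = x - \lambda_1 \le 0 & y \ge 0 & \text{and} \quad y Z_y = 0 \\
Z_{\lambda_1} = 100 - x - y \ge 0 & \lambda_1 \ge 0 & \text{and} \quad \lambda_1 Z_{\lambda_1} = 0 \\
Z_{\lambda_2} = 40 - x \ge 0 & \lambda_2 \ge 0 & \text{and} \quad \lambda_2 Z_{\lambda_2} = 0
\end{array}
$$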


To solve a nonlinear programming problem, the typical approach is one of trial and 
error. We can, for example, start by trying a zero value for a choice variable. Setting a 
variable equal to zero always simplifies the marginal conditions by causing certain terms to 
drop out. If appropriate nonnegative values of Lagrange multipliers can then be found that 
satisfy all the marginal inequalities, the zero solution will be optimal. If, on the other hand, 
the zero solution violates some of the inequalities, then we must let one or more choice 
variables be positive. For every positive choice variable, we may, by complementary slackness, 
convert a weak inequality marginal condition into a strict equality. Properly solved, such an 
equality will lead us either to a solution, or to a contradiction that would then compel us to 
try something else. If a solution exists, such trials will eventually enable us to uncover it. We 
can also start by assuming one of the constraints to be nonbinding. Then the related 
Lagrange multiplier will be zero by complementary slackness and we have thus eliminated 
a variable. If this assumption leads to a contradiction, then we must treat the said constraint 
as a strict equality and proceed on that basis. 

For the present example, it makes no sense to try $x = 0$ or $y = 0$, for then we would have 
$U = xy = 0$. We therefore assume both $x$ and $y$ to be nonzero, and deduce $Z_x = Z_y = 0$ 
from complementary slackness. This means 

$$y - \lambda_1 - \lambda_2 = x - \lambda_1 \quad (= 0)$$

so that 

$$y - \lambda_2 = x$$

Now, assume the ration constraint to be nonbinding in the solution, which implies that 
$\lambda_2 = 0$. Then we have $x = y$, and the given budget $B = 100$ yields the trial solution 
$x = y = 50$. But this solution violates the ration constraint $x \le 40$. Hence we must adopt 
the alternative assumption that the ration constraint is binding with $x^* = 40$. The budget 
constraint then allows the consumer to have $y^* = 60$. Moreover, since complementary 
slackness dictates that $Z_x = Z_y = 0$, we can readily calculate that $\lambda_1^* = 40$ and $\lambda_2^* = 20$. 
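
The two trials just described can be retraced numerically. The following is a minimal Python sketch and not part of the original exposition: the variable names are illustrative, and the brute-force search over integer bundles at the end is only an assumed cross-check of the optimum, not a solution method proposed in the text.

```python
# Retrace the trial-and-error steps for:
#   maximize U = xy  subject to  x + y <= 100,  x <= 40,  x, y >= 0
BUDGET, RATION = 100, 40

# Trial 1: assume the ration is nonbinding, so lambda_2 = 0 and y - lambda_2 = x gives x = y.
x_trial = y_trial = BUDGET / 2
trial_1_feasible = x_trial <= RATION      # 50 <= 40 fails, so this trial is rejected

# Trial 2: treat the ration as binding (x = 40) and spend the rest of the budget on y.
x_star = RATION
y_star = BUDGET - x_star
lam1 = x_star                             # from Z_y = x - lambda_1 = 0
lam2 = y_star - lam1                      # from Z_x = y - lambda_1 - lambda_2 = 0

# Cross-check (an assumption of this sketch): search all feasible integer bundles.
best_bundle = max(
    ((x, y) for x in range(RATION + 1) for y in range(BUDGET - x + 1)),
    key=lambda bundle: bundle[0] * bundle[1],
)

print(trial_1_feasible)                   # False
print(x_star, y_star, lam1, lam2)         # 40 60 40 20
print(best_bundle)                        # (40, 60)
```

The grid search is included only because the Kuhn-Tucker conditions are necessary conditions; it confirms that no other integer bundle yields a higher utility than the bundle found by the second trial.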

Interpretation of the Kuhn-Tucker Conditions 

Parts of the Kuhn-Tucker conditions (13.14) are merely a restatement of certain aspects of 
the given problem. Thus the conditions $x_j \ge 0$ merely repeat the nonnegativity restrictions, 
and the conditions $\partial Z/\partial \lambda_i \ge 0$ merely reiterate the constraints. To include these in (13.14), 
however, has the important advantage of revealing more clearly the remarkable symmetry 
between the two types of variables, $x_j$ (choice variables) and $\lambda_i$ (Lagrange multipliers). To 
each variable in each category, there corresponds a marginal condition, $\partial Z/\partial x_j \le 0$ or 
$\partial Z/\partial \lambda_i \ge 0$, to be satisfied by the optimal solution. Each of the variables must be 
nonnegative as well. And, finally, each variable is characterized by complementary slackness in 
relation to a particular partial derivative of the Lagrangian function $Z$. This means that, for 
each $x_j$, we must find in the optimal solution that either the marginal condition holds as an 
equality, as in the classical context, or the choice variable in question must take a zero 
value, or both. Analogously, for each $\lambda_i$, we must find in the optimal solution that either 
the marginal condition holds as an equality (meaning that the $i$th constraint is exactly 
satisfied) or the Lagrange multiplier vanishes, or both. 

An even more explicit interpretation is possible when we look at the expanded expressions 
for $\partial Z/\partial x_j$ and $\partial Z/\partial \lambda_i$ in (13.14). Assume the problem to be the familiar production 
problem. Then we have 

$f_j$ = marginal gross profit of $j$th product 

$\lambda_i$ = shadow price of $i$th resource (the opportunity cost of using a unit of the 
$i$th resource) 

$g^i_j$ = amount of $i$th resource used up in producing the marginal unit of $j$th product 

$\lambda_i g^i_j$ = marginal imputed cost of $i$th resource incurred in producing a unit of 
$j$th product 

$\sum_i \lambda_i g^i_j$ = aggregate marginal imputed cost of $j$th product 

Thus the marginal condition 

$$\frac{\partial Z}{\partial x_j} = f_j - \sum_i \lambda_i g^i_j \le 0$$

requires that the marginal gross profit of the $j$th product be no greater than its aggregate 
marginal imputed cost; i.e., no underimputation is permitted. The complementary-slackness 
condition then means that, if the optimal solution calls for the active production 
of the $j$th product ($x_j^* > 0$), the marginal gross profit must be exactly equal to the aggregate 
marginal imputed cost ($\partial Z/\partial x_j^* = 0$), as would be the situation in the classical optimization 
problem. If, on the other hand, the marginal gross profit optimally falls short of the 
aggregate imputed cost ($\partial Z/\partial x_j^* < 0$), entailing excess imputation, then that product must 
not be produced ($x_j^* = 0$).† This latter situation is something that can never occur in the 
classical context, for if the marginal gross profit is less than the marginal imputed cost, then 
the output should in that framework be reduced all the way to the level where the marginal 
condition is satisfied as an equality. What causes the situation of $\partial Z/\partial x_j^* < 0$ to qualify as 
an optimal one here is the explicit specification of nonnegativity in the present framework. 
For then the most we can do in the way of output reduction is to lower production to the 
level $x_j^* = 0$, and if we still find $\partial Z/\partial x_j^* < 0$ at the zero output, we stop there anyway. 

As for the remaining conditions, which relate to the variables $\lambda_i$, their meanings are even 
easier to perceive. First of all, the marginal condition $\partial Z/\partial \lambda_i \ge 0$ merely requires the firm to 
stay within the capacity limitation of every resource in the plant. The complementary-slackness 
condition then stipulates that, if the $i$th resource is not fully used in the optimal solution 
($\partial Z/\partial \lambda_i^* > 0$), the shadow price of that resource (which is never allowed to be negative) 
must be set equal to zero ($\lambda_i^* = 0$). On the other hand, if a resource has a positive shadow price 
in the optimal solution ($\lambda_i^* > 0$), then it is perforce a fully utilized resource ($\partial Z/\partial \lambda_i^* = 0$). 

It is also possible, of course, to take the Lagrange-multiplier value $\lambda_i^*$ to be a measure 
of how the optimal value of the objective function reacts to a slight relaxation of the $i$th 
constraint. In that light, complementary slackness would mean that, if the $i$th constraint is 
optimally not binding ($\partial Z/\partial \lambda_i^* > 0$), then relaxing that particular constraint will not affect 
the optimal value of the gross profit ($\lambda_i^* = 0$), just as loosening a belt which is not 
constricting one's waist to begin with will not produce any greater comfort. If, on the other 
hand, a slight relaxation of the $i$th constraint (increasing the endowment of the $i$th resource) 
does increase the gross profit ($\lambda_i^* > 0$), then that resource constraint must in fact be 
binding in the optimal solution ($\partial Z/\partial \lambda_i^* = 0$). 
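
The reading of a multiplier as the effect of relaxing a constraint can be illustrated with the rationing example of Example 1 ($U = xy$, budget 100, ration 40, where $\lambda_1^* = 40$ and $\lambda_2^* = 20$). The Python sketch below is an illustration only: the function names are hypothetical, and the closed-form optimum inside `max_utility` is an assumption specific to that example (with the whole budget spent, $U = x(B - x)$ peaks at $x = B/2$, so the ration binds only when it lies below that level).

```python
def max_utility(budget, ration):
    """Optimal U = xy under x + y <= budget and x <= ration, with x, y >= 0."""
    x = min(ration, budget / 2)      # the ration binds only if it is below budget/2
    return x * (budget - x)

def marginal_gain(budget, ration, relax, step=1e-4):
    """Finite-difference change in the optimal utility when one constraint is relaxed."""
    base = max_utility(budget, ration)
    if relax == "budget":
        return (max_utility(budget + step, ration) - base) / step
    return (max_utility(budget, ration + step) - base) / step

shadow_budget = marginal_gain(100, 40, "budget")   # should be close to lambda_1 = 40
shadow_ration = marginal_gain(100, 40, "ration")   # should be close to lambda_2 = 20
slack_ration = marginal_gain(100, 60, "ration")    # a ration of 60 does not bind

print(round(shadow_budget, 2), round(shadow_ration, 2), slack_ration)
```

With a ration of 60 the consumer would choose $x = 50$ anyway, so loosening the ration further changes nothing, which is the belt analogy in numerical form.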


The n-Variable, m-Constraint Case 


The preceding discussion can be generalized in a straightforward manner to the case where 
there are $n$ choice variables and $m$ constraints. The Lagrangian function $Z$ will appear in 
the more general form 

$$Z = f(x_1, x_2, \ldots, x_n) + \sum_{i=1}^{m} \lambda_i [r_i - g^i(x_1, x_2, \ldots, x_n)] \qquad (13.15)$$

And the Kuhn-Tucker conditions will simply be 

$$
\begin{array}{llll}
\dfrac{\partial Z}{\partial x_j} \le 0 & x_j \ge 0 & \text{and} \quad x_j \dfrac{\partial Z}{\partial x_j} = 0 & \text{[maximization]} \\[6pt]
\dfrac{\partial Z}{\partial \lambda_i} \ge 0 & \lambda_i \ge 0 & \text{and} \quad \lambda_i \dfrac{\partial Z}{\partial \lambda_i} = 0 & (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)
\end{array}
\qquad (13.16)
$$

Here, in order to avoid a cluttered appearance, we have not written out the expanded 
expressions for the partial derivatives $\partial Z/\partial x_j$ and $\partial Z/\partial \lambda_i$. But you are urged to write them 
out for a more detailed view of the Kuhn-Tucker conditions, similar to what was given in 
(13.14). Note that, aside from the change in the dimension of the problem, the Kuhn-Tucker 
conditions remain entirely the same. The interpretation of these conditions should naturally 
also remain the same. 


† Remember that, given the equation $ab = 0$, where $a$ and $b$ are real numbers, we can legitimately 
infer that $a \neq 0$ implies $b = 0$, but it is not true that $a = 0$ implies $b \neq 0$, since $b = 0$ is also 
consistent with $a = 0$. 
What if the problem is one of minimization? One way of handling it is to convert the 
problem into a maximization problem and then apply (13.16). To minimize $C$ is equivalent 
to maximizing $-C$, so such a conversion is always feasible. But we must, of course, also 
reverse the constraint inequalities by multiplying every constraint through by $-1$. Instead of 
going through the conversion process, however, we may, again using the Lagrangian 
function $Z$ as defined in (13.15), directly apply the minimization version of the Kuhn-Tucker 
conditions as follows: 

$$
\begin{array}{llll}
\dfrac{\partial Z}{\partial x_j} \ge 0 & x_j \ge 0 & \text{and} \quad x_j \dfrac{\partial Z}{\partial x_j} = 0 & \text{[minimization]} \\[6pt]
\dfrac{\partial Z}{\partial \lambda_i} \le 0 & \lambda_i \ge 0 & \text{and} \quad \lambda_i \dfrac{\partial Z}{\partial \lambda_i} = 0 & (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)
\end{array}
\qquad (13.17)
$$


This you should compare with (13.16). 

Reading (13.16) and (13.17) horizontally (rowwise), we see that the Kuhn-Tucker conditions 
for both maximization and minimization problems consist of a set of conditions relating 
to the choice variables $x_j$ (first row) and another set relating to the Lagrange multipliers $\lambda_i$ 
(second row). Reading them vertically (columnwise), on the other hand, we note that, for each 
$x_j$ and $\lambda_i$, there is a marginal condition (first column), a nonnegativity restriction (second 
column), and a complementary-slackness condition (third column). In any given problem, 
the marginal conditions pertaining to the choice variables always differ, as a group, from the 
marginal conditions for the Lagrange multipliers in the sense of inequality they take. 

Subject to the proviso to be explained in Sec. 13.2, the Kuhn-Tucker maximum conditions 
(13.16) and minimum conditions (13.17) are necessary conditions for a local maximum 
and local minimum, respectively. But since a global maximum (minimum) must also 
be a local maximum (minimum), the Kuhn-Tucker conditions can also be taken as necessary 
conditions for a global maximum (minimum), subject to the same proviso. 
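
The row-and-column structure of (13.16) and (13.17) lends itself to a mechanical check. The following Python sketch is one possible way to encode it, offered as an illustration rather than a standard routine: the function name `kuhn_tucker_holds`, its argument layout, and the tolerance are all assumptions, and the one-variable problem used to exercise it is hypothetical.

```python
def kuhn_tucker_holds(dz_dx, dz_dlam, x, lam, maximize=True, tol=1e-9):
    """Check the sign pattern of (13.16) (maximize=True) or (13.17) (maximize=False).

    dz_dx[j] and dz_dlam[i] are the partial derivatives of Z, already evaluated
    at the candidate point (x, lam).
    """
    sign = 1 if maximize else -1
    # First column: marginal conditions (opposite inequalities for the two groups).
    marginal = (all(sign * d <= tol for d in dz_dx)
                and all(sign * d >= -tol for d in dz_dlam))
    # Second column: nonnegativity of choice variables and multipliers.
    variables = list(x) + list(lam)
    nonnegative = all(v >= -tol for v in variables)
    # Third column: complementary slackness, each variable times its own derivative.
    derivatives = list(dz_dx) + list(dz_dlam)
    slackness = all(abs(v * d) <= tol for v, d in zip(variables, derivatives))
    return marginal and nonnegative and slackness

# Hypothetical illustration: maximize -x subject to x <= 5 and x >= 0,
# so Z = -x + lam*(5 - x). At the corner x = 0, lam = 0: dZ/dx = -1 and dZ/dlam = 5.
as_maximum = kuhn_tucker_holds([-1], [5], [0], [0], maximize=True)
as_minimum = kuhn_tucker_holds([-1], [5], [0], [0], maximize=False)
print(as_maximum, as_minimum)   # True False
```

The same numbers pass the maximization pattern and fail the minimization pattern, which reflects the fact that the two sets of conditions differ only in the sense of the marginal inequalities.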


Example 2 


Let us apply the Kuhn-Tucker conditions to solve a minimization problem: 

Minimize $C = (x_1 - 4)^2 + (x_2 - 4)^2$ 

subject to $2x_1 + 3x_2 \ge 6$ 

$-3x_1 - 2x_2 \ge -12$ 

and $x_1, x_2 \ge 0$ 

The Lagrangian function for this problem is 

$$Z = (x_1 - 4)^2 + (x_2 - 4)^2 + \lambda_1 (6 - 2x_1 - 3x_2) + \lambda_2 (-12 + 3x_1 + 2x_2)$$

Since the problem is one of minimization, the appropriate conditions are (13.17), which 
include the four marginal conditions 

$$
\begin{aligned}
\frac{\partial Z}{\partial x_1} &= 2(x_1 - 4) - 2\lambda_1 + 3\lambda_2 \ge 0 \\
\frac{\partial Z}{\partial x_2} &= 2(x_2 - 4) - 3\lambda_1 + 2\lambda_2 \ge 0 \\
\frac{\partial Z}{\partial \lambda_1} &= 6 - 2x_1 - 3x_2 \le 0 \\
\frac{\partial Z}{\partial \lambda_2} &= -12 + 3x_1 + 2x_2 \le 0
\end{aligned}
\qquad (13.18)
$$

plus the nonnegativity and complementary-slackness conditions. 
To find a solution, we again use the trial-and-error approach, realizing that the first few 
trials may lead us into a blind alley. Suppose we first try $\lambda_1 > 0$ and $\lambda_2 > 0$ and check 
whether we can find corresponding $x_1$ and $x_2$ values that satisfy both constraints. With 
positive Lagrange multipliers, we must have $\partial Z/\partial \lambda_1 = \partial Z/\partial \lambda_2 = 0$. From the last two lines 
of (13.18), we can thus write 

$$2x_1 + 3x_2 = 6 \qquad \text{and} \qquad 3x_1 + 2x_2 = 12$$

These two equations yield the trial solution $x_1 = 4\frac{4}{5}$ and $x_2 = -1\frac{1}{5}$, which violates the 
nonnegativity restriction on $x_2$. 

Let us next try $x_1 > 0$ and $x_2 > 0$, which would imply $\partial Z/\partial x_1 = \partial Z/\partial x_2 = 0$ by 
complementary slackness. Then, from the first two lines of (13.18), we can write 

$$2(x_1 - 4) - 2\lambda_1 + 3\lambda_2 = 0 \qquad \text{and} \qquad 2(x_2 - 4) - 3\lambda_1 + 2\lambda_2 = 0 \qquad (13.19)$$

Multiplying the first equation by 2, and the second equation by 3, then subtracting the 
latter from the former, we can eliminate $\lambda_2$ and obtain the result 

$$4x_1 - 6x_2 + 5\lambda_1 + 8 = 0$$

By further assuming $\lambda_1 = 0$, we can derive the following relationship between $x_1$ and $x_2$: 

$$x_1 - \tfrac{3}{2} x_2 = -2 \qquad (13.20)$$

In order to solve for the two variables, however, we need another relationship between $x_1$ 
and $x_2$. For this purpose, let us assume that $\lambda_2 \neq 0$, so that $\partial Z/\partial \lambda_2 = 0$. Then, from the 
last line of (13.18), we can write (after rearrangement) 

$$3x_1 + 2x_2 = 12 \qquad (13.21)$$

Together, (13.20) and (13.21) yield another trial solution 

$$x_1^* = \frac{28}{13} = 2\tfrac{2}{13} > 0 \qquad x_2^* = \frac{36}{13} = 2\tfrac{10}{13} > 0$$

Substituting these values into (13.19), and solving for the Lagrange multipliers, we get 

$$\lambda_1^* = 0 \qquad \lambda_2^* = \frac{16}{13} > 0$$

Since the solution values for the four variables are all nonnegative and satisfy both 
constraints, they are acceptable as the final solution. 
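
The arithmetic of the two trials can be reproduced exactly with rational numbers. The Python sketch below is an illustration only; the helper `solve_2x2` (Cramer's rule) and the variable names are assumptions of the sketch, not notation from the text.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*u + a12*v = b1 and a21*u + a22*v = b2 by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det

# Trial 1: both constraints binding, 2*x1 + 3*x2 = 6 and 3*x1 + 2*x2 = 12.
trial_1 = solve_2x2(2, 3, 6, 3, 2, 12)

# Trial 2: lambda_1 = 0 and the second constraint binding, i.e. (13.20) and (13.21).
x1, x2 = solve_2x2(1, Fraction(-3, 2), -2, 3, 2, 12)
lam1 = Fraction(0)
lam2 = -2 * (x1 - 4) / 3          # first equation of (13.19) with lambda_1 = 0

# Evaluate the four marginal conditions (13.18) at the trial-2 values.
dz_dx1 = 2 * (x1 - 4) - 2 * lam1 + 3 * lam2
dz_dx2 = 2 * (x2 - 4) - 3 * lam1 + 2 * lam2
dz_dlam1 = 6 - 2 * x1 - 3 * x2
dz_dlam2 = -12 + 3 * x1 + 2 * x2

print(trial_1)                    # x2 comes out negative, so trial 1 is rejected
print(x1, x2, lam2)               # 28/13 36/13 16/13
print(dz_dx1, dz_dx2, dz_dlam1, dz_dlam2)
```

At the second trial point the first constraint is slack ($\partial Z/\partial \lambda_1 < 0$) while its multiplier is zero, and the other three derivatives vanish, so every complementary-slackness product is zero.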


EXERCISE 13.1 

1. Draw a set of diagrams similar to those in Fig. 13.1 for the minimization case, and deduce 
a set of necessary conditions for a local minimum corresponding to (13.2) through 
(13.4). Then condense these conditions into a single statement similar to (13.5). 

2. (a) Show that, in (13.16), instead of writing 

$$\lambda_i \frac{\partial Z}{\partial \lambda_i} = 0 \qquad (i = 1, \ldots, m)$$

as a set of $m$ separate conditions, it is sufficient to write a single equation in the 
form of 

$$\sum_{i=1}^{m} \lambda_i \frac{\partial Z}{\partial \lambda_i} = 0$$

(b) Can we do the same for the following set of conditions? 

$$x_j \frac{\partial Z}{\partial x_j} = 0 \qquad (j = 1, \ldots, n)$$

3. Based on the reasoning used in Prob. 2, which set (or sets) of conditions in (13.17) can 
be condensed into a single equation? 

4. Suppose the problem is 

Minimize $C = f(x_1, x_2, \ldots, x_n)$ 

subject to $g^i(x_1, x_2, \ldots, x_n) \ge r_i \qquad (i = 1, 2, \ldots, m)$ 

and $x_j \ge 0 \qquad (j = 1, 2, \ldots, n)$ 

Write the Lagrangian function, take the derivatives $\partial Z/\partial x_j$ and $\partial Z/\partial \lambda_i$, and write out 
the expanded version of the Kuhn-Tucker minimum conditions (13.17). 

5. Convert the minimization problem in Prob. 4 into a maximization problem, formulate 
the Lagrangian function, take the derivatives with respect to $x_j$ and $\lambda_i$, and apply the 
Kuhn-Tucker maximum conditions (13.16). Are the results consistent with those 
obtained in Prob. 4? 


13.2 The Constraint Qualification 

The Kuhn-Tucker conditions are necessary conditions only if a particular proviso is satisfied. 
That proviso, called the constraint qualification, imposes a certain restriction on the 
constraint functions of a nonlinear programming problem, for the specific purpose of ruling 
out certain irregularities on the boundary of the feasible set that would invalidate the 
Kuhn-Tucker conditions should the optimal solution occur there. 

Irregularities at Boundary Points 

Let us first illustrate the nature of such irregularities by means of some concrete examples. 


Example 1 


Maximize $\pi = x_1$ 

subject to $x_2 - (1 - x_1)^3 \le 0$ 

and $x_1, x_2 \ge 0$ 

As shown in Fig. 13.2, the feasible region is the set of points that lie in the first quadrant 
on or below the curve $x_2 = (1 - x_1)^3$. Since the objective function directs us to maximize 
$x_1$, the optimal solution is the point (1, 0). But the solution fails to satisfy the Kuhn-Tucker 
maximum conditions. To check this, we first write the Lagrangian function 

$$Z = x_1 + \lambda_1 [-x_2 + (1 - x_1)^3]$$

As the first marginal condition, we should then have 

$$\frac{\partial Z}{\partial x_1} = 1 - 3\lambda_1 (1 - x_1)^2 \le 0$$

In fact, since $x_1^* = 1$ is positive, complementary slackness requires that this derivative vanish 
when evaluated at the point (1, 0). However, the actual value we get happens to be 
$\partial Z/\partial x_1^* = 1$, thus violating the given marginal condition. 

The reason for this anomaly is that the optimal solution (1, 0) occurs in this example at 
an outward-pointing cusp, which constitutes one type of irregularity that can invalidate the 
Kuhn-Tucker conditions at a boundary optimal solution. A cusp is a sharp point formed 
when a curve takes a sudden reversal in direction, such that the slope of the curve on one 
side of the point is the same as the slope of the curve on the other side of the point. Here, 
the boundary of the feasible region at first follows the constraint curve, but when the point 
(1,0) is reached, it takes an abrupt turn westward and follows the trail of the horizontal axis 
thereafter. Since the slopes of both the curved side and the horizontal side of the boundary 
are zero at the point (1, 0), that point is a cusp. 
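
The failure at the cusp can be seen numerically: at $x_1 = 1$ the term containing the multiplier drops out, so no choice of $\lambda_1$ can bring the derivative down to zero. The Python sketch below is illustrative only, and its function names are assumptions.

```python
def dz_dx1(x1, lam1):
    """Partial derivative of Z = x1 + lam1*(-x2 + (1 - x1)**3) with respect to x1."""
    return 1 - 3 * lam1 * (1 - x1) ** 2

def constraint_gradient(x1):
    """Partial derivatives of g = x2 - (1 - x1)**3 with respect to x1 and x2."""
    return (3 * (1 - x1) ** 2, 1)

values_at_cusp = [dz_dx1(1, lam1) for lam1 in (0, 1, 10, 1000)]
print(values_at_cusp)             # [1, 1, 1, 1]
print(constraint_gradient(1))     # (0, 1)
```

The constraint gradient at (1, 0) has a zero first component, which is why the multiplier cannot offset the objective's unit slope in the $x_1$ direction.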

Cusps are the most frequently cited culprits for the failure of the Kuhn-Tucker conditions, 
but the truth is that the presence of a cusp is neither necessary nor sufficient to cause those 
conditions to fail at an optimal solution. Examples 2 and 3 will confirm this. 


Example 2 


To the problem of Example 1, let us add a new constraint 

$$2x_1 + x_2 \le 2$$

whose border, $x_2 = 2 - 2x_1$, plots as a straight line with slope $-2$ which passes through the 
optimal point in Fig. 13.2. Clearly, the feasible region remains the same as before, and so 
does the optimal solution at the cusp. But if we write the new Lagrangian function 

$$Z = x_1 + \lambda_1 [-x_2 + (1 - x_1)^3] + \lambda_2 [2 - 2x_1 - x_2]$$

and the marginal conditions 

$$
\begin{aligned}
\frac{\partial Z}{\partial x_1} &= 1 - 3\lambda_1 (1 - x_1)^2 - 2\lambda_2 \le 0 \\
\frac{\partial Z}{\partial x_2} &= -\lambda_1 - \lambda_2 \le 0 \\
\frac{\partial Z}{\partial \lambda_1} &= -x_2 + (1 - x_1)^3 \ge 0 \\
\frac{\partial Z}{\partial \lambda_2} &= 2 - 2x_1 - x_2 \ge 0
\end{aligned}
$$

it turns out that the values $x_1^* = 1$, $x_2^* = 0$, $\lambda_1^* = 1$, and $\lambda_2^* = \frac{1}{2}$ do satisfy these four 
inequalities, as well as the nonnegativity and complementary-slackness conditions. As a matter of 
fact, $\lambda_1^*$ can be assigned any nonnegative value (not just 1), and all the conditions can still 
be satisfied, which goes to show that the optimal value of a Lagrange multiplier is not 
necessarily unique. More importantly, however, this example shows that the Kuhn-Tucker 
conditions can remain valid despite the cusp. 


Example 3 


The feasible region of the problem 

Maximize $\pi = x_2 - x_1^2$ 

subject to $-(10 - x_1^2 - x_2)^3 \le 0$ 

$-x_1 \le -2$ 

and $x_1, x_2 \ge 0$ 

as shown in Fig. 13.3, contains no cusp anywhere. Yet, at the optimal solution, (2, 6), the 
Kuhn-Tucker conditions nonetheless fail to hold. For, with the Lagrangian function 

$$Z = x_2 - x_1^2 + \lambda_1 (10 - x_1^2 - x_2)^3 + \lambda_2 (-2 + x_1)$$

the second marginal condition would require that 

$$\frac{\partial Z}{\partial x_2} = 1 - 3\lambda_1 (10 - x_1^2 - x_2)^2 \le 0$$

Indeed, since $x_2^*$ is positive, this derivative should vanish when evaluated at the point (2, 6). 
But actually we get $\partial Z/\partial x_2^* = 1$, regardless of the value assigned to $\lambda_1$. Thus the 
Kuhn-Tucker conditions can fail even in the absence of a cusp, and indeed even when the feasible 
region is a convex set as in Fig. 13.3. The fundamental reason why cusps are neither necessary nor 
sufficient for the failure of the Kuhn-Tucker conditions is that the irregularities 
referred to previously relate, not to the shape of the feasible region per se, but to the forms of 
the constraint functions themselves. 

FIGURE 13.3 
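
Both claims can be checked by direct substitution. The Python sketch below is illustrative, with hypothetical function names: it confirms that in Example 2 the conditions hold at (1, 0) with $\lambda_2 = 1/2$ for several values of $\lambda_1$, and that in Example 3 the derivative $\partial Z/\partial x_2$ equals 1 at (2, 6) whatever value $\lambda_1$ takes.

```python
from fractions import Fraction

def example_2_derivatives(x1, x2, lam1, lam2):
    """Derivatives of Z = x1 + lam1*(-x2 + (1 - x1)**3) + lam2*(2 - 2*x1 - x2)."""
    return (
        1 - 3 * lam1 * (1 - x1) ** 2 - 2 * lam2,   # dZ/dx1, required <= 0
        -lam1 - lam2,                               # dZ/dx2, required <= 0
        -x2 + (1 - x1) ** 3,                        # dZ/dlam1, required >= 0
        2 - 2 * x1 - x2,                            # dZ/dlam2, required >= 0
    )

def example_2_holds(x1, x2, lam1, lam2):
    d1, d2, d3, d4 = example_2_derivatives(x1, x2, lam1, lam2)
    signs = d1 <= 0 and d2 <= 0 and d3 >= 0 and d4 >= 0
    nonnegative = min(x1, x2, lam1, lam2) >= 0
    slackness = x1 * d1 == 0 and x2 * d2 == 0 and lam1 * d3 == 0 and lam2 * d4 == 0
    return signs and nonnegative and slackness

half = Fraction(1, 2)
example_2_results = [example_2_holds(1, 0, lam1, half) for lam1 in (0, 1, 7)]
print(example_2_results)          # [True, True, True]

def example_3_dz_dx2(x1, x2, lam1):
    """dZ/dx2 for Z = x2 - x1**2 + lam1*(10 - x1**2 - x2)**3 + lam2*(-2 + x1)."""
    return 1 - 3 * lam1 * (10 - x1 ** 2 - x2) ** 2

example_3_results = [example_3_dz_dx2(2, 6, lam1) for lam1 in (0, 1, 50)]
print(example_3_results)          # [1, 1, 1]
```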
The Constraint Qualification 

Boundary irregularities, cusp or no cusp, will not occur if a certain constraint qualification 
is satisfied. 

To explain this, let $x^* = (x_1^*, x_2^*, \ldots, x_n^*)$ be a boundary point of the feasible region and 
a possible candidate for a solution, and let $dx = (dx_1, dx_2, \ldots, dx_n)$ represent a particular 
direction of movement from the said boundary point. The direction-of-movement 
interpretation of the vector $dx$ is perfectly in line with our earlier interpretation of a vector as a 
directed line segment (an arrow), but here, the point of departure is the point $x^*$ instead of 
the point of origin, and so the vector $dx$ is not in the nature of a radius vector. We shall now 
impose two requirements on the vector $dx$. First, if the $j$th choice variable has a zero value 
at the point $x^*$, then we shall only permit a nonnegative change on the $x_j$ axis, that is, 

$$dx_j \ge 0 \qquad \text{if } x_j^* = 0 \qquad (13.22)$$

Second, if the $i$th constraint is exactly satisfied at the point $x^*$, then we shall only allow 
values of $dx_1, \ldots, dx_n$ such that the value of the constraint function $g^i(x^*)$ will not increase 
(for a maximization problem) or will not decrease (for a minimization problem), that is, 

$$
dg^i(x^*) = g^i_1\, dx_1 + g^i_2\, dx_2 + \cdots + g^i_n\, dx_n
\begin{cases} \le 0 & (\text{max.}) \\ \ge 0 & (\text{min.}) \end{cases}
\qquad \text{if } g^i(x^*) = r_i \qquad (13.23)
$$

where all the partial derivatives $g^i_j$ are to be evaluated at $x^*$. If a vector $dx$ satisfies 
(13.22) and (13.23), we shall refer to it as a test vector. Finally, if there exists a 
differentiable arc that (1) emanates from the point $x^*$, (2) is contained entirely in the feasible 
region, and (3) is tangent to a given test vector, we shall call it a qualifying arc for that test 
vector. With this background, the constraint qualification can be stated simply as follows: 


The constraint qualification is satisfied if, for any point $x^*$ on the boundary of the feasible 
region, there exists a qualifying arc for every test vector $dx$. 


Example 4 


We shall show that the optimal point (1, 0) of Example 1 in Fig. 13.2, which fails the 
Kuhn-Tucker conditions, also fails the constraint qualification. At that point, $x_2^* = 0$; thus the test 
vector must satisfy 

$$dx_2 \ge 0 \qquad \text{[by (13.22)]}$$

Moreover, since the (only) constraint, $g^1 = x_2 - (1 - x_1)^3 \le 0$, is exactly satisfied at (1, 0), 
we must let [by (13.23)] 

$$g^1_1\, dx_1 + g^1_2\, dx_2 = 3(1 - x_1^*)^2\, dx_1 + dx_2 = dx_2 \le 0$$

These two requirements together imply that we must let $dx_2 = 0$. In contrast, we are free 
to choose $dx_1$. Thus, for instance, the vector $(dx_1, dx_2) = (2, 0)$ is an acceptable test vector, 
as is $(dx_1, dx_2) = (-1, 0)$. The latter test vector would plot in Fig. 13.2 as an arrow starting 
from (1, 0) and pointing in the due-west direction (not drawn), and it is clearly possible to 
draw a qualifying arc for it. (The curved boundary of the feasible region itself can serve as a 
qualifying arc.) On the other hand, the test vector $(dx_1, dx_2) = (2, 0)$ would plot as an 
arrow starting from (1, 0) and pointing in the due-east direction (not drawn). Since there is 
no way to draw a smooth arc tangent to this vector and lying entirely within the feasible 
region, no qualifying arcs exist for it. Hence the optimal solution point (1, 0) violates the 
constraint qualification. 
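
The two requirements on a test vector, and the reason the eastward vector has no qualifying arc, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names are hypothetical, and `feasible_x2_range` simply reports which values of $x_2$ are allowed at a given $x_1$ under the constraint of Example 1 together with $x_2 \ge 0$.

```python
def is_test_vector(dx1, dx2):
    """Apply (13.22) and (13.23) at the point (1, 0) of Example 1."""
    x1_star = 1
    g1, g2 = 3 * (1 - x1_star) ** 2, 1     # partial derivatives of g = x2 - (1 - x1)**3
    return dx2 >= 0 and g1 * dx1 + g2 * dx2 <= 0

def feasible_x2_range(x1):
    """Interval of x2 values allowed at a given x1, or None when no value is feasible."""
    upper = (1 - x1) ** 3                  # from x2 <= (1 - x1)**3
    return (0, upper) if upper >= 0 else None

print(is_test_vector(2, 0), is_test_vector(-1, 0), is_test_vector(0, 1))   # True True False

west_is_feasible = feasible_x2_range(0.9) is not None
east_ranges = [feasible_x2_range(1 + step) for step in (0.1, 0.01, 0.001)]
print(west_is_feasible)           # True
print(east_ranges)                # [None, None, None]
```

Because no point with $x_1 > 1$ is feasible, however close to (1, 0), any arc leaving (1, 0) in the due-east direction must exit the feasible region immediately, whereas points to the west remain available for a qualifying arc.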
Example 5 


Referring to Example 2, let us illustrate that, after an additional constraint $2x_1 + x_2 \le 2$ is 
added to Fig. 13.2, the point (1, 0) will satisfy the constraint qualification, thereby 
revalidating the Kuhn-Tucker conditions. 

As in Example 4, we have to require $dx_2 \ge 0$ (because $x_2^* = 0$) and $dx_2 \le 0$ (because the 
first constraint is exactly satisfied); thus, $dx_2 = 0$. But the second constraint is also exactly 
satisfied, thereby requiring 

$$g^2_1\, dx_1 + g^2_2\, dx_2 = 2\, dx_1 + dx_2 = 2\, dx_1 \le 0 \qquad \text{[by (13.23)]}$$

With nonpositive $dx_1$ and zero $dx_2$, the only admissible test vectors, aside from the null 
vector itself, are those pointing in the due-west direction in Fig. 13.2 from (1, 0). All of 
these lie along the horizontal axis in the feasible region, and it is certainly possible to draw 
a qualifying arc for each test vector. Hence, this time the constraint qualification indeed is 
satisfied. 

Linear Constraints 

Earlier, in Example 3, it was demonstrated that the convexity of the feasible set does not 
guarantee the validity of the Kuhn-Tucker conditions as necessary conditions. However, if 
the feasible region is a convex set formed by linear constraints only, then the constraint 
qualification will invariably be met, and the Kuhn-Tucker conditions will always hold at an 
optimal solution. This being the case, we need never worry about boundary irregularities 
when dealing with a nonlinear programming problem with linear constraints. 


Example 6 


Let us illustrate the linear-constraint result in the two-variable, two-constraint framework. 
For a maximization problem, the linear constraints can be written as 

$$a_{11} x_1 + a_{12} x_2 \le r_1$$

$$a_{21} x_1 + a_{22} x_2 \le r_2$$

where we shall take all the parameters to be positive. Then, as indicated in Fig. 13.4, the first 
constraint border will have a slope of $-a_{11}/a_{12} < 0$, and the second, a slope of $-a_{21}/a_{22} < 0$. 
The boundary points of the shaded feasible region fall into the following five types: (1) the 
point of origin, where the two axes intersect, (2) points that lie on one axis segment, such 
as $J$ and $S$, (3) points at the intersection of one axis and one constraint border, namely, $K$ and 
$R$, (4) points lying on a single constraint border, such as $L$ and $N$, (5) the point of 
intersection of the two constraints, $M$. We may briefly examine each type in turn with reference to 
the satisfaction of the constraint qualification. 

FIGURE 13.4 

1. At the origin, no constraint is exactly satisfied, so we may ignore (13.23). But since 
$x_1 = x_2 = 0$, we must choose test vectors with $dx_1 \ge 0$ and $dx_2 \ge 0$, by (13.22). Hence 
all test vectors from the origin must point in the due-east, due-north, or northeast 
directions, as depicted in Fig. 13.4. These vectors all happen to fall within the feasible set, and 
a qualifying arc clearly can be found for each. 

2. At a point like $J$, we can again ignore (13.23). The fact that $x_2 = 0$ means that we must 
choose $dx_2 \ge 0$, but our choice of $dx_1$ is free. Hence all vectors would be acceptable 
except those pointing southward ($dx_2 < 0$). Again all such vectors fall within the feasible 
region, and there exists a qualifying arc for each. The analysis of point $S$ is similar. 

3. At points $K$ and $R$, both (13.22) and (13.23) must be considered. Specifically, at $K$, we have 
to choose $dx_2 \ge 0$ since $x_2 = 0$, so that we must rule out all southward arrows. The second 
constraint being exactly satisfied, moreover, the test vectors for point $K$ must satisfy 

$$g^2_1\, dx_1 + g^2_2\, dx_2 = a_{21}\, dx_1 + a_{22}\, dx_2 \le 0 \qquad (13.24)$$

Since at $K$ we also have $a_{21} x_1 + a_{22} x_2 = r_2$ (second constraint border), however, we may 
add this equality to (13.24) and modify the restriction on the test vector to the form 

$$a_{21}(x_1 + dx_1) + a_{22}(x_2 + dx_2) \le r_2 \qquad (13.24')$$

Interpreting $(x_j + dx_j)$ to be the new value of $x_j$ attained at the arrowhead of a test 
vector, we may construe (13.24') to mean that all test vectors must have their arrowheads 
located on or below the second constraint border. Consequently, all these vectors 
must again fall within the feasible region, and a qualifying arc can be found for each. The 
analysis of point $R$ is analogous. 

4. At points such as $L$ and $N$, neither variable is zero and (13.22) can be ignored. However, 
for point $N$, (13.23) dictates that 

$$g^1_1\, dx_1 + g^1_2\, dx_2 = a_{11}\, dx_1 + a_{12}\, dx_2 \le 0 \qquad (13.25)$$

Since point $N$ satisfies $a_{11} x_1 + a_{12} x_2 = r_1$ (first constraint border), we may add this 
equality to (13.25) and write 

$$a_{11}(x_1 + dx_1) + a_{12}(x_2 + dx_2) \le r_1 \qquad (13.25')$$

This would require the test vectors to have arrowheads located on or below the first 
constraint border in Fig. 13.4. Thus we obtain essentially the same kind of result 
encountered in the other cases. The analysis of point $L$ is analogous. 

5. At point $M$, we may again disregard (13.22), but this time (13.23) requires all test 
vectors to satisfy both (13.24) and (13.25). Since we may modify the latter conditions to the 
forms in (13.24') and (13.25'), all test vectors must now have their arrowheads located 
on or below the first as well as the second constraint borders. The result thus again 
duplicates those of the previous cases. 
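
The five cases above can be checked mechanically for one concrete pair of linear constraints. In the Python sketch below, the coefficients ($x_1 + 2x_2 \le 8$ and $2x_1 + x_2 \le 10$), the coordinates chosen for the points labeled after Fig. 13.4, and the step length are all hypothetical values picked for illustration; they are not taken from the text or the figure. For every sampled direction that passes (13.22) and (13.23), the sketch verifies that a short straight step in that direction stays feasible, so the step itself can serve as a qualifying arc.

```python
from fractions import Fraction
from itertools import product

# Hypothetical parameters: x1 + 2*x2 <= 8 and 2*x1 + x2 <= 10.
A = [(1, 2), (2, 1)]
R = [8, 10]

def feasible(point):
    nonnegative = all(v >= 0 for v in point)
    return nonnegative and all(
        a1 * point[0] + a2 * point[1] <= r for (a1, a2), r in zip(A, R)
    )

def is_test_vector(point, dx):
    # (13.22): no negative change along an axis on which the variable is zero.
    if any(v == 0 and d < 0 for v, d in zip(point, dx)):
        return False
    # (13.23): a binding constraint must not increase (maximization case).
    for (a1, a2), r in zip(A, R):
        binding = a1 * point[0] + a2 * point[1] == r
        if binding and a1 * dx[0] + a2 * dx[1] > 0:
            return False
    return True

boundary_points = {
    "origin": (0, 0), "J": (2, 0), "S": (0, 2), "K": (5, 0), "R": (0, 4),
    "L": (Fraction(9, 2), 1), "N": (2, 3), "M": (4, 2),
}
step = Fraction(1, 100)
directions = [d for d in product(range(-3, 4), repeat=2) if d != (0, 0)]

all_inside = all(
    feasible((p[0] + step * d[0], p[1] + step * d[1]))
    for p in boundary_points.values()
    for d in directions
    if is_test_vector(p, d)
)
print(all_inside)                          # True
print(is_test_vector((5, 0), (1, 0)))      # False: moving east at K raises the binding constraint
```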

In this example, it so happens that, for every type of boundary point considered, the test 
vectors all lie within the feasible region. While this locational feature makes the qualifying 
arcs easy to find, it is by no means a prerequisite for their existence. In a problem with a 
nonlinear constraint border, in particular, the constraint border itself may serve as a 
qualifying arc for some test vector that lies outside of the feasible region. An example of this can 
be found in one of the problems below. 


EXERCISE 13.2 

1. Check whether the solution point $(x_1^*, x_2^*) = (2, 6)$ in Example 3 satisfies the constraint 
qualification. 

2. Maximize $\pi = x_1$ 

subject to $x_1^2 + x_2^2 \le 1$ 

and $x_1, x_2 \ge 0$ 

Solve graphically and check whether the optimal-solution point satisfies (a) the 
constraint qualification and (b) the Kuhn-Tucker conditions. 

3. Minimize $C = x_1$ 

subject to $x_1^2 - x_2 \ge 0$ 

and $x_1, x_2 \ge 0$ 

Solve graphically. Does the optimal solution occur at a cusp? Check whether the 
optimal solution satisfies (a) the constraint qualification and (b) the Kuhn-Tucker minimum 
conditions. 

4. Minimize $C = x_1$ 

subject to $-x_2 - (1 - x_1)^3 \ge 0$ 

and $x_1, x_2 \ge 0$ 

Show that (a) the optimal solution $(x_1^*, x_2^*) = (1, 0)$ does not satisfy the Kuhn-Tucker 
conditions, but (b) by introducing a new multiplier $\lambda_0 \ge 0$, and modifying the 
Lagrangian function (13.15) to the form 

$$Z_0 = \lambda_0 f(x_1, x_2, \ldots, x_n) + \sum_{i=1}^{m} \lambda_i [r_i - g^i(x_1, x_2, \ldots, x_n)]$$

the Kuhn-Tucker conditions can be satisfied at (1, 0). (Note: The Kuhn-Tucker conditions 
on the multipliers extend to only $\lambda_1, \ldots, \lambda_m$, but not to $\lambda_0$.) 

13.3 Economic Applications

War-Time Rationing 

Typically during times of war the civilian population is subject to some form of rationing
of basic consumer goods. Usually, the method of rationing is through the use of redeemable
coupons used by the government. The government will supply each consumer with an
allotment of coupons each month. In turn, the consumer will have to redeem a certain
number of coupons at the time of purchase of a rationed good. This effectively means the
consumer pays two prices at the time of the purchase. He or she pays both the coupon price and
the monetary price of the rationed good. This requires the consumer to have both sufficient
funds and sufficient coupons in order to buy a unit of the rationed good.

Consider the case of a two-good world where both goods, x and y, are rationed. Let the
consumer’s utility function be $U = U(x, y)$. The consumer has a fixed money budget of B
and faces exogenous prices $P_x$ and $P_y$. Further, the consumer has an allotment of coupons,
denoted C, which can be used to purchase either x or y at a coupon price of $c_x$ and $c_y$.
Therefore the consumer’s maximization problem is

Maximize $U = U(x, y)$

subject to $P_x x + P_y y \le B$

$c_x x + c_y y \le C$

and $x, y \ge 0$

The Lagrangian for the problem is 


$Z = U(x, y) + \lambda_1 (B - P_x x - P_y y) + \lambda_2 (C - c_x x - c_y y)$


where $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers. Since both constraints are linear, the
constraint qualification is satisfied and the Kuhn-Tucker conditions are necessary:


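$Z_x = U_x - \lambda_1 P_x - \lambda_2 c_x \le 0$    $x \ge 0$    $x Z_x = 0$

$Z_y = U_y - \lambda_1 P_y - \lambda_2 c_y \le 0$    $y \ge 0$    $y Z_y = 0$

$Z_{\lambda_1} = B - P_x x - P_y y \ge 0$    $\lambda_1 \ge 0$    $\lambda_1 Z_{\lambda_1} = 0$

$Z_{\lambda_2} = C - c_x x - c_y y \ge 0$    $\lambda_2 \ge 0$    $\lambda_2 Z_{\lambda_2} = 0$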


Example 1 


Suppose the utility function is of the form $U = xy^2$. Further, let $B = 100$ and $P_x = P_y = 1$
while $C = 120$, $c_x = 2$, and $c_y = 1$.

The Lagrangian takes the specific form


$Z = xy^2 + \lambda_1 (100 - x - y) + \lambda_2 (120 - 2x - y)$


The Kuhn-Tucker conditions are now 


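$Z_x = y^2 - \lambda_1 - 2\lambda_2 \le 0$    $x \ge 0$    $x Z_x = 0$

$Z_y = 2xy - \lambda_1 - \lambda_2 \le 0$    $y \ge 0$    $y Z_y = 0$

$Z_{\lambda_1} = 100 - x - y \ge 0$    $\lambda_1 \ge 0$    $\lambda_1 Z_{\lambda_1} = 0$

$Z_{\lambda_2} = 120 - 2x - y \ge 0$    $\lambda_2 \ge 0$    $\lambda_2 Z_{\lambda_2} = 0$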


Again, the solution procedure involves a certain amount of trial and error. We can first
choose one of the constraints to be nonbinding and solve for x and y. Once found, use
these values to test if the constraint chosen to be nonbinding is violated. If it is, then redo
the procedure choosing another constraint to be nonbinding. If violation of the nonbinding
constraint occurs again, then we can assume both constraints bind and the solution is
determined only by the constraints.

Step 1: Assume that the second (ration) constraint is nonbinding in the solution, so that
$\lambda_2 = 0$ by complementary slackness. But let x, y, and $\lambda_1$ be positive so that complementary
slackness would give us the following three equations:

$Z_x = y^2 - \lambda_1 = 0$
$Z_y = 2xy - \lambda_1 = 0$
$Z_{\lambda_1} = 100 - x - y = 0$

Solving for x and y yields a trial solution

$x = 33\tfrac{1}{3}$    $y = 66\tfrac{2}{3}$

However, when we substitute these solutions into the coupon constraint we find that

$2(33\tfrac{1}{3}) + 66\tfrac{2}{3} = 133\tfrac{1}{3} > 120$

This solution violates the coupon constraint, and must be rejected.

Step 2: Now let us reverse the assumptions on $\lambda_1$ and $\lambda_2$ so that $\lambda_1 = 0$, but let
$\lambda_2, x, y > 0$. Then, from the marginal conditions, we have

$Z_x = y^2 - 2\lambda_2 = 0$

$Z_y = 2xy - \lambda_2 = 0$

$Z_{\lambda_2} = 120 - 2x - y = 0$

Solving this system of equations yields another trial solution

$x = 20$    $y = 80$

which implies that $\lambda_2 = 2xy = 3{,}200$. These solution values, together with $\lambda_1 = 0$, satisfy
both the budget and ration constraints. Thus we can accept them as the final solution to
the Kuhn-Tucker conditions.

This optimal solution, however, contains a curious abnormality. With the budget
constraint binding in the solution, we would normally expect the related Lagrange multiplier to
be positive, yet we actually have $\lambda_1 = 0$. Thus, in this example, while the budget constraint
is mathematically binding (satisfied as a strict equality in the solution), it is economically
nonbinding (not calling for a positive marginal utility of money).
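
The two trial steps of Example 1 can be retraced mechanically. The following is only a sketch under stated assumptions: the helper names `trial` and `feasible` are ours, exact fractions are used to avoid rounding, and the closed-form shortcut (y = 2ax/b when the constraint with prices a and b binds) is specific to U = xy^2 with x, y > 0.

```python
from fractions import Fraction as F

# Data of Example 1: U = x*y**2, budget x + y <= 100, coupons 2x + y <= 120.
B, C = F(100), F(120)
PX, PY = F(1), F(1)   # money prices
CX, CY = F(2), F(1)   # coupon prices


def trial(binding):
    """Solve the marginal conditions when only one constraint is assumed to bind.

    With x, y > 0 and the other multiplier set to zero, Z_x = Z_y = 0 gives
    y**2 = lam*a and 2*x*y = lam*b, where (a, b) are the prices in the binding
    constraint. Hence y = (2*a/b)*x, and a*x + b*y = r becomes 3*a*x = r.
    """
    a, b, r = (PX, PY, B) if binding == "budget" else (CX, CY, C)
    x = r / (3 * a)
    y = 2 * a * x / b
    lam = 2 * x * y / b
    return x, y, lam


def feasible(x, y):
    """Check both the budget and the coupon constraint."""
    return PX * x + PY * y <= B and CX * x + CY * y <= C


step1 = trial("budget")   # Step 1: assume lambda_2 = 0
step2 = trial("coupon")   # Step 2: assume lambda_1 = 0

for name, (x, y, lam) in (("Step 1", step1), ("Step 2", step2)):
    print(name, "x =", x, "y =", y, "multiplier =", lam, "feasible:", feasible(x, y))
```

Under these assumptions the first trial violates the coupon constraint, while the second satisfies both constraints and leaves the budget constraint holding as an equality, which is the abnormality noted in the text.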

Peak-Load Pricing 

Peak and off-peak pricing and planning problems are commonplace for firms with
capacity-constrained production processes. Usually the firm has invested in capacity in order to
target a primary market. However, there may exist a secondary market in which the firm can
often sell its product. Once the capital equipment has been purchased to service the firm’s
primary market, it is freely available (up to capacity) to be used in the secondary market.
Typical examples include schools and universities that build to meet daytime needs (peak),
but may offer night-school classes (off-peak); theaters that offer shows in the evening
(peak) and matinees (off-peak); and trucking companies that have dedicated routes but
may choose to enter “back-haul” markets. Since the capacity cost is a factor in the
profit-maximizing decision for the peak market and is already paid, it normally should not be a
factor in calculating optimal price and quantity for the smaller, off-peak market. However,
if the secondary market's demand is close to the same size as the primary market, capacity
constraints may be an issue, especially since it is a common practice to price discriminate
and charge lower prices in off-peak periods. Even though the secondary market is smaller
than the primary, it is possible that, at the lower (profit-maximizing) price, off-peak demand
exceeds capacity. In such cases capacity choices must be made taking both markets into
account, making the problem a classic application of nonlinear programming.

Consider a profit-maximizing company that faces two average-revenue curves

$P_1 = P^1(Q_1)$    in the daytime (peak period)

$P_2 = P^2(Q_2)$    in the nighttime (off-peak period)

To operate, the firm must pay b per unit of output, whether it is day or night. Furthermore,
the firm must purchase capacity at a cost of c per unit of capacity. Let K denote total capacity
measured in units of Q. The firm must pay for capacity, regardless of whether it operates in
the off-peak period. Who should be charged for the capacity costs: peak, off-peak, or both
sets of customers? The firm’s maximization problem becomes


Maximize (with respect to $Q_1, Q_2, K$)    $\pi = P_1 Q_1 + P_2 Q_2 - b(Q_1 + Q_2) - cK$

subject to    $Q_1 \le K$

$Q_2 \le K$

where    $P_1 = P^1(Q_1)$

$P_2 = P^2(Q_2)$

and    $Q_1, Q_2, K \ge 0$


In view that the total revenue for $Q_i$,

$R_i \equiv P_i Q_i = P^i(Q_i) Q_i$

is a function of $Q_i$ alone, we can simplify the statement of the problem to

Maximize $\pi = R_1(Q_1) + R_2(Q_2) - b(Q_1 + Q_2) - cK$

subject to $Q_1 \le K$

$Q_2 \le K$

and $Q_1, Q_2, K \ge 0$

Note that both constraints are linear; thus the constraint qualification is satisfied and the
Kuhn-Tucker conditions are necessary.

The Lagrangian function is

$Z = R_1(Q_1) + R_2(Q_2) - b(Q_1 + Q_2) - cK + \lambda_1 (K - Q_1) + \lambda_2 (K - Q_2)$

and the Kuhn-Tucker conditions are


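$Z_1 = MR_1 - b - \lambda_1 \le 0$    $Q_1 \ge 0$    $Q_1 Z_1 = 0$

$Z_2 = MR_2 - b - \lambda_2 \le 0$    $Q_2 \ge 0$    $Q_2 Z_2 = 0$

$Z_K = -c + \lambda_1 + \lambda_2 \le 0$    $K \ge 0$    $K Z_K = 0$

$Z_{\lambda_1} = K - Q_1 \ge 0$    $\lambda_1 \ge 0$    $\lambda_1 Z_{\lambda_1} = 0$

$Z_{\lambda_2} = K - Q_2 \ge 0$    $\lambda_2 \ge 0$    $\lambda_2 Z_{\lambda_2} = 0$


where $MR_i$ is the marginal revenue of $Q_i$ (i = 1, 2).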

The solution procedure again entails trial and error. Let us first assume that
$Q_1, Q_2, K > 0$. Then, by complementary slackness, we have

$MR_1 - b - \lambda_1 = 0$

$MR_2 - b - \lambda_2 = 0$    (13.26)

$-c + \lambda_1 + \lambda_2 = 0$    $(\lambda_1 = c - \lambda_2)$

which can be condensed into two equations after eliminating $\lambda_1$:

$MR_1 = b + c - \lambda_2$
$MR_2 = b + \lambda_2$    (13.26')


Then we proceed in two steps.

FIGURE 13.5

Step 1: Since the off-peak market is a secondary market, its marginal-revenue function
($MR_2$) can be expected to lie below that of the primary market ($MR_1$) as illustrated in
Fig. 13.5. Moreover, the capacity constraint is more likely to be nonbinding in the
secondary market so that $\lambda_2$ is more likely to be zero. So we try $\lambda_2 = 0$. Then (13.26')
becomes

$MR_1 = b + c$
$MR_2 = b$    (13.26'')


The fact that the primary market absorbs the entire capacity cost c implies that $Q_1 = K$.
However, we still need to check whether the constraint $Q_2 \le K$ is satisfied. If so, we have
found a valid solution. Figure 13.5(a) illustrates the case where $Q_1 = K$ and $Q_2 < K$ in
the solution. The $MR_1$ curve intersects the b + c line at point $E_1$, and the $MR_2$ curve
intersects the b line at point $E_2$.

What if the previous trial solution entails $Q_2 > K$, as would occur if the $MR_2$ curve is
very close to $MR_1$, so as to intersect the b line at an output larger than K? Then, of course,
the second constraint is violated, and we must reject the assumption of $\lambda_2 = 0$, and proceed
to the next step.

Step 2: Now let us assume both Lagrange multipliers to be positive, and thus
$Q_1 = Q_2 = K$. Then, unable to eliminate any variables from (13.26), we have

$MR_1 = b + \lambda_1$

$MR_2 = b + \lambda_2$    (13.26''')

$c = \lambda_1 + \lambda_2$

This case is illustrated in Fig. 13.5(b), where points $E_1$ and $E_2$ satisfy the first two
equations in (13.26'''). From the third equation, we see that the capacity cost c is the sum of the
two Lagrange multipliers. This means $\lambda_1$ and $\lambda_2$ represent the portions of the capacity cost
borne respectively by the two markets.

Example 2

Suppose the average-revenue function during peak hours is

$P_1 = 22 - 10^{-5} Q_1$

and that during off-peak hours it is

$P_2 = 18 - 10^{-5} Q_2$

To produce a unit of output per half-day requires a unit of capacity costing 8 cents per day.
The cost of a unit of capacity is the same whether it is used at peak times only, or off-peak
also. In addition to the costs of capacity, it costs 6 cents in operating costs (labor and fuel)
to produce 1 unit per half-day (both day and evening).

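If we assume that the capacity constraint is nonbinding in the secondary market
($\lambda_2 = 0$), then the given Kuhn-Tucker conditions become

$\lambda_1 = c = 8$

$22 - 2 \times 10^{-5} Q_1 = b + c = 14$
$18 - 2 \times 10^{-5} Q_2 = b = 6$

where, in the last two equations, the left-hand side is marginal revenue (MR) and the
right-hand side is marginal cost (MC). Solving this system gives us

$Q_1 = 400{,}000$
$Q_2 = 600{,}000$

which violates the assumption that the second constraint is nonbinding because
$Q_2 > Q_1 = K$.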

Therefore, let us assume that both constraints are binding. Then $Q_1 = Q_2 = Q$ and the
Kuhn-Tucker conditions become

$\lambda_1 + \lambda_2 = 8$

$22 - 2 \times 10^{-5} Q = 6 + \lambda_1$
$18 - 2 \times 10^{-5} Q = 6 + \lambda_2$

which yield the following solution

$Q_1 = Q_2 = K = 500{,}000$
$\lambda_1 = 6$    $\lambda_2 = 2$

$P_1 = 17$    $P_2 = 13$

Since the capacity constraint is binding in both markets, the primary market pays $\lambda_1 = 6$ of
the capacity cost and the secondary market pays $\lambda_2 = 2$.
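
The two-step procedure of Example 2 can be sketched in a few lines for linear average-revenue curves of the form P_i = a_i - kQ_i. The function name `peak_load` and its parameter names are ours, introduced for illustration only. The sketch assumes interior outputs (Q_1, Q_2, K > 0) and does not verify that the first multiplier stays nonnegative in Step 2.

```python
from fractions import Fraction as F


def peak_load(a1, a2, k, b, c):
    """Trial-and-error solution for linear demands P_i = a_i - k*Q_i.

    Marginal revenue is MR_i = a_i - 2*k*Q_i. Interior outputs are assumed,
    so the marginal conditions hold as equalities, as in (13.26).
    """
    # Step 1: try lambda_2 = 0, so that MR_1 = b + c and MR_2 = b.
    q1 = (a1 - b - c) / (2 * k)
    q2 = (a2 - b) / (2 * k)
    if q2 <= q1:  # off-peak output fits within capacity K = Q1
        return {"Q1": q1, "Q2": q2, "K": q1, "lam1": c, "lam2": F(0)}
    # Step 2: both constraints bind, Q1 = Q2 = K. Adding MR_1 = b + lam1 and
    # MR_2 = b + lam2, and using lam1 + lam2 = c, leaves one equation in K.
    cap = (a1 + a2 - 2 * b - c) / (4 * k)
    lam1 = a1 - 2 * k * cap - b
    lam2 = a2 - 2 * k * cap - b
    return {"Q1": cap, "Q2": cap, "K": cap, "lam1": lam1, "lam2": lam2}


# Data of Example 2 (prices and costs in cents).
sol = peak_load(F(22), F(18), F(1, 100000), F(6), F(8))
p1 = F(22) - F(1, 100000) * sol["Q1"]   # peak price
p2 = F(18) - F(1, 100000) * sol["Q2"]   # off-peak price
print(sol, p1, p2)
```

Exact fractions are used so that the comparison in Step 1 is not affected by floating-point rounding.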


EXERCISE 13.3 

1. Suppose in Example 2 a unit of capacity costs only 3 cents per day.

(a) What would be the profit-maximizing peak and off-peak prices and quantities?

(b) What would be the values of the Lagrange multipliers? What interpretation do you
put on their values?

2. A consumer lives on an island where she produces two goods, x and y, according to the
production possibility frontier $x^2 + y^2 \le 200$, and she consumes all the goods herself.
Her utility function is

$U = xy^3$

The consumer also faces an environmental constraint on her total output of both
goods. The environmental constraint is given by $x + y \le 20$.

(a) Write out the Kuhn-Tucker first-order conditions.

(b) Find the consumer's optimal x and y. Identify which constraints are binding.

3. An electric company is setting up a power plant in a foreign country, and it has to plan its
capacity. The peak-period demand for power is given by $P_1 = 400 - Q_1$ and the off-peak
demand is given by $P_2 = 380 - Q_2$. The variable cost is 20 per unit (paid in both
markets) and capacity costs 10 per unit which is only paid once and is used in both periods.

(a) Write out the Lagrangian and Kuhn-Tucker conditions for this problem.

(b) Find the optimal outputs and capacity for this problem.

(c) How much of the capacity is paid for by each market (i.e., what are the values of $\lambda_1$
and $\lambda_2$)?

(d) Now suppose capacity cost is 30 cents per unit (paid only once). Find quantities,
capacity, and how much of the capacity is paid for by each market (i.e., $\lambda_1$ and $\lambda_2$).

13.4 Sufficiency Theorems in Nonlinear Programming

In the previous sections, we have introduced the Kuhn-Tucker conditions and illustrated
their applications as necessary conditions in optimization problems with inequality
constraints. Under certain circumstances, the Kuhn-Tucker conditions can also be taken as
sufficient conditions.

The Kuhn-Tucker Sufficiency Theorem: Concave Programming 

In classical optimization problems, the sufficient conditions for maximum and minimum
are traditionally expressed in terms of the signs of second-order derivatives or differentials.
As we have shown in Sec. 11.5, however, these second-order conditions are closely related
to the concepts of concavity and convexity of the objective function. Here, in nonlinear
programming, the sufficient conditions can also be stated directly in terms of concavity and
convexity. And, in fact, these concepts will be applied not only to the objective function
f(x) but to the constraint functions $g^i(x)$ as well.

For the maximization problem, Kuhn and Tucker offer the following statement of
sufficient conditions (sufficiency theorem):

Given the nonlinear programming problem

Maximize $\pi = f(x)$

subject to $g^i(x) \le r_i$    (i = 1, 2, ..., m)

and $x \ge 0$

if the following conditions are satisfied:

(a) the objective function f(x) is differentiable and concave in the nonnegative orthant

(b) each constraint function $g^i(x)$ is differentiable and convex in the nonnegative orthant

(c) the point x* satisfies the Kuhn-Tucker maximum conditions

then x* gives a global maximum of $\pi = f(x)$.

Note that, in this theorem, the constraint qualification is nowhere mentioned. This is
because we have already assumed, in condition (c), that the Kuhn-Tucker conditions are
satisfied at x* and, consequently, the question of the constraint qualification is no longer
an issue.

As it stands, the above theorem indicates that conditions (a), (b), and (c) are sufficient to
establish x* to be an optimal solution. Looking at it differently, however, we may also
interpret it to mean that given (a) and (b), then the Kuhn-Tucker maximum conditions are
sufficient for a maximum. In the preceding section, we learned that the Kuhn-Tucker
conditions, though not necessary per se, become necessary when the constraint qualification
is satisfied. Combining this information with the sufficiency theorem, we may now state
that if the constraint qualification is satisfied and if conditions (a) and (b) are realized, then
the Kuhn-Tucker maximum conditions will be necessary-and-sufficient for a maximum.
This would be the case, for instance, when all the constraints are linear inequalities, which
is sufficient for satisfying the constraint qualification.

The maximization problem dealt with in the sufficiency theorem above is often referred to
as concave programming. This name arises because Kuhn and Tucker adopt the $\ge$ inequality
instead of the $\le$ inequality in every constraint, so that condition (b) would require the $g^i(x)$
functions to be all concave, like the f(x) function. But we have modified the formulation in
order to convey the idea that in a maximization problem, a constraint is imposed to “rein in”
(hence, $\le$) the attempt to ascend to higher points on the objective function. Though different
in form, the two formulations are equivalent in substance. For brevity, we omit the proof.

As stated above, the sufficiency theorem deals only with maximization problems. But
adaptation to minimization problems is by no means difficult. Aside from the appropriate
changes in the theorem to reflect the reversal of the problem itself, all we have to do is to
interchange the two words concave and convex in conditions (a) and (b) and to use the
Kuhn-Tucker minimum conditions in condition (c). (See Exercise 13.4-1.)

The Arrow-Enthoven Sufficiency Theorem: Quasiconcave Programming

To apply the Kuhn-Tucker sufficiency theorem, certain concavity-convexity specifications
must be met. These constitute quite stringent requirements. In another sufficiency theorem,
the Arrow-Enthoven sufficiency theorem,* these specifications are relaxed to the extent of
requiring only quasiconcavity and quasiconvexity in the objective and constraint functions.
With the requirements thus weakened, the scope of applicability of the sufficient conditions
is correspondingly widened.
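
To see how much is gained by the weaker requirement, consider the utility function U = xy^2 used in Example 1 of Sec. 13.3. The sketch below runs a crude midpoint screen on a small grid of points in the nonnegative orthant. The grid, the tolerance, and the helper names are our own choices, and a sampled check of this kind can only reveal violations; it proves nothing about points not sampled.

```python
import itertools


def u(p):
    """Utility U = x*y**2 evaluated at the point p = (x, y)."""
    x, y = p
    return x * y ** 2


def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


# Integer sample points with 0 <= x, y <= 4 (an arbitrary illustrative grid).
grid = [(x, y) for x in range(5) for y in range(5)]
pairs = list(itertools.combinations(grid, 2))
TOL = 1e-12

# Concavity requires u(midpoint) >= average of the endpoint values.
concave_violations = [(p, q) for p, q in pairs
                      if u(midpoint(p, q)) < (u(p) + u(q)) / 2 - TOL]

# Quasiconcavity requires u(midpoint) >= the smaller endpoint value.
quasi_violations = [(p, q) for p, q in pairs
                    if u(midpoint(p, q)) < min(u(p), u(q)) - TOL]

print(len(concave_violations), "concavity violations, e.g.", concave_violations[0])
print(len(quasi_violations), "quasiconcavity violations")
```

On this grid the function fails the concavity screen (for instance between the points (1, 0) and (1, 2)) but shows no violation of the quasiconcavity inequality, which suggests why a theorem requiring only quasiconcavity covers cases that the Kuhn-Tucker sufficiency theorem does not.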

In the original formulation of the Arrow-Enthoven paper, with a maximization problem
and with constraints in the $\ge$ form, the f(x) and $g^i(x)$ functions must uniformly be
quasiconcave in order for their theorem to be applicable. This gives rise to the name quasiconcave
programming. In our discussion here, however, we shall again use the $\le$ inequality in the
constraints of a maximization problem and the $\ge$ inequality in the minimization problem.

The theorem is as follows: 

Given the nonlinear programming problem

Maximize $\pi = f(x)$

subject to $g^i(x) \le r_i$    (i = 1, 2, ..., m)

and $x \ge 0$

if the following conditions are satisfied:

(a) the objective function f(x) is differentiable and quasiconcave in the nonnegative
orthant

(b) each constraint function $g^i(x)$ is differentiable and quasiconvex in the nonnegative
orthant

(c) the point x* satisfies the Kuhn-Tucker maximum conditions

(d) any one of the following is satisfied:

(d-i) $f_j(x^*) < 0$ for at least one variable $x_j$

(d-ii) $f_j(x^*) > 0$ for some variable $x_j$ that can take on a positive value without
violating the constraints

(d-iii) the n derivatives $f_j(x^*)$ are not all zero, and the function f(x) is twice
differentiable in the neighborhood of x* [i.e., all the second-order partial
derivatives of f(x) exist at x*]

(d-iv) the function f(x) is concave

then x* gives a global maximum of $\pi = f(x)$.

* Kenneth J. Arrow and Alain C. Enthoven, "Quasi-concave Programming," Econometrica, October,
1961, pp. 779-800.

Since the proof of this theorem is quite lengthy, we shall omit it here. However, we do
want to call your attention to a few important features of this theorem. For one thing, while
Arrow and Enthoven have succeeded in weakening the concavity-convexity specifications
to their quasiconcavity-quasiconvexity counterparts, they find it necessary to append a new
requirement, (d). Note, though, that only one of the four alternatives listed under (d) is
required to form a complete set of sufficient conditions. In effect, therefore, the above
theorem contains as many as four different sets of sufficient conditions for a maximum.
In the case of (d-iv), with f(x) concave, it would appear that the Arrow-Enthoven
sufficiency theorem becomes identical with the Kuhn-Tucker sufficiency theorem. But this is
not true. Inasmuch as Arrow and Enthoven only require the constraint functions $g^i(x)$ to be
quasiconvex, their sufficient conditions are still weaker.

As stated, the theorem lumps together the conditions (a) through (d) as a set of sufficient
conditions. But it is also possible to interpret it to mean that, when (a), (b), and (d) are
satisfied, then the Kuhn-Tucker maximum conditions become sufficient conditions for a
maximum. Furthermore, if the constraint qualification is also satisfied, then the Kuhn-Tucker
conditions will become necessary-and-sufficient for a maximum.

Like the Kuhn-Tucker theorem, the Arrow-Enthoven theorem can be adapted with ease
to the minimization framework. Aside from the obvious changes that are needed to reverse
the direction of optimization, we simply have to interchange the words quasiconcave and
quasiconvex in conditions (a) and (b), replace the Kuhn-Tucker maximum conditions by
the minimum conditions, reverse the inequalities in (d-i) and (d-ii), and change the word
concave to convex in (d-iv).


A Constraint-Qualification Test 

It was mentioned in Sec. 13.2 that if all constraint functions are linear, then the constraint
qualification is satisfied. In case the $g^i(x)$ functions are nonlinear, the following test offered
by Arrow and Enthoven may prove useful in determining whether the constraint
qualification is satisfied:

For a maximization problem, if

(a) every constraint function $g^i(x)$ is differentiable and quasiconvex

(b) there exists a point $x^0$ in the nonnegative orthant such that all the constraints are
satisfied as strict inequalities at $x^0$

(c) one of the following is true:

(c-i) every $g^i(x)$ function is convex

(c-ii) the partial derivatives of every $g^i(x)$ are not all zero when evaluated at every
point x in the feasible region

then the constraint qualification is satisfied.

Again, this test can be adapted to the minimization problem with ease. To do so, just change
the word quasiconvex to quasiconcave in condition (a), and change the word convex to
concave in (c-i).


EXERCISE 13.4 

1. Given: Minimize $C = F(x)$

subject to $G^i(x) \ge r_i$    (i = 1, 2, ..., m)
and $x \ge 0$

(a) Convert it into a maximization problem.

(b) What in the present problem are the equivalents of the f and $g^i$ functions in the
Kuhn-Tucker sufficiency theorem?

(c) Hence, what concavity-convexity conditions should be placed on the F and $G^i$
functions to make the sufficient conditions for a maximum applicable here?

(d) On the basis of the above, how would you state the Kuhn-Tucker sufficient
conditions for a minimum?

2. Is the Kuhn-Tucker sufficiency theorem applicable to:

(a) Maximize $\pi = x_1$

subject to $x_1^2 + x_2^2 \le 1$

and $x_1, x_2 \ge 0$

(b) Minimize $C = (x_1 - 3)^2 + (x_2 - 4)^2$
subject to $x_1 + x_2 \ge 4$

and $x_1, x_2 \ge 0$

(c) Minimize $C = 2x_1 + x_2$

subject to $x_1^2 - 4x_1 + x_2 \ge 0$
and $x_1, x_2 \ge 0$

3. Which of the following functions are mathematically acceptable as the objective
function of a maximization problem which qualifies for the application of the
Arrow-Enthoven sufficiency theorem?

(a) $f(x) = x^3 - 2x$
(b) $f(x_1, x_2) = 6x_1 - 9x_2$

(c) $f(x_1, x_2) = x_2 - \ln x_1$ (Note: See Exercise 12.4-4.)

4. Is the Arrow-Enthoven constraint qualification satisfied, given that the constraints of a
maximization problem are:

(a) $x_1^2 + (x_2 - 5)^2 \le 4$ and $5x_1 + x_2 \le 10$

(b) $x_1 + x_2 \le 8$ and $-x_1 x_2 \le -8$ (Note: $-x_1 x_2$ is not convex.)

13.5 Maximum-Value Functions and the Envelope Theorem†

A maximum-value function is an objective function where the choice variables have been
assigned their optimal values. These optimal values of the choice variables are, in turn,
functions of the exogenous variables and parameters of the problem. Once the optimal
values of the choice variables have been substituted into the original objective function, the
function indirectly becomes a function of the parameters only (through the parameters'
influence on the optimal values of the choice variables). Thus the maximum-value function is
also referred to as the indirect objective function.

The Envelope Theorem for Unconstrained Optimization 

What is the significance of the indirect objective function? Consider that in any
optimization problem the direct objective function is maximized (or minimized) for a given set of
parameters. The indirect objective function traces out all the maximum values of the
objective function as these parameters vary. Hence the indirect objective function is an
“envelope” of the set of optimized objective functions generated by varying the parameters
of the model. For most students of economics the first illustration of this notion of an
envelope arises in the comparison of short-run and long-run cost curves. Students are
typically taught that the long-run average cost curve is an envelope of all the short-run average
cost curves (what parameter is varying along the envelope in this case?). A formal
derivation of this concept is one of the exercises we will be doing in this section.

To illustrate, consider the following unconstrained maximization problem with two
choice variables x and y and one parameter $\phi$:

Maximize $U = f(x, y, \phi)$    (13.27)

The first-order necessary condition is

$f_x(x, y, \phi) = f_y(x, y, \phi) = 0$    (13.28)

If second-order conditions are met, these two equations implicitly define the solutions

$x^* = x^*(\phi)$    $y^* = y^*(\phi)$    (13.29)

If we substitute these solutions into the objective function, we obtain a new function

$V(\phi) = f(x^*(\phi), y^*(\phi), \phi)$    (13.30)

where this function is the value of f when the values of x and y are those that maximize
$f(x, y, \phi)$. Therefore, $V(\phi)$ is the maximum-value function (or indirect objective function).

† This section of the chapter presents an overview of the envelope theorem. A richer treatment of this
topic can be found in Chap. 7 of The Structure of Economics: A Mathematical Analysis (3rd ed.) by
Eugene Silberberg and Wing Suen (McGraw-Hill, 2001) on which parts of this section are based.

If we differentiate V with respect to $\phi$, its only argument, we get

$\frac{dV}{d\phi} = f_x \frac{\partial x^*}{\partial \phi} + f_y \frac{\partial y^*}{\partial \phi} + f_\phi$    (13.31)

However, from the first-order conditions we know $f_x = f_y = 0$. Therefore, the first two
terms disappear and the result becomes

$\frac{dV}{d\phi} = f_\phi$    (13.31')


This result says that, at the optimum, as $\phi$ varies, with x* and y* allowed to adjust, the
derivative $dV/d\phi$ gives the same result as if x* and y* are treated as constants. Note that $\phi$
enters the maximum-value function (13.30) in three places: one direct and two indirect
(through x* and y*). Equation (13.31') shows that, at the optimum, only the direct effect of
$\phi$ on the objective function matters. This is the essence of the envelope theorem. The
envelope theorem says that only the direct effects of a change in an exogenous variable need be
considered, even though the exogenous variable may also enter the maximum-value
function indirectly as part of the solution to the endogenous choice variables.
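
A small numerical check may help fix the idea. The objective function below, f = -x^2 - y^2 + φx + 2φy, is a hypothetical example of our own choosing (it does not come from the text); its first-order conditions give x* = φ/2 and y* = φ. The sketch compares the total derivative of V, computed by finite differences with x* and y* allowed to adjust, against the direct partial derivative f_φ evaluated at the optimum.

```python
def f(x, y, phi):
    """Hypothetical objective: concave in (x, y) for every phi."""
    return -x ** 2 - y ** 2 + phi * x + 2 * phi * y


def optimum(phi):
    """Solutions of f_x = -2x + phi = 0 and f_y = -2y + 2*phi = 0."""
    return phi / 2, phi


def V(phi):
    """Maximum-value function: f evaluated at the optimal x and y."""
    x_star, y_star = optimum(phi)
    return f(x_star, y_star, phi)


def f_phi(x, y, phi):
    """Direct partial derivative of f with respect to phi, holding x and y fixed."""
    return x + 2 * y


phi0, h = 3.0, 1e-6
# Total effect: x* and y* readjust as phi changes (central difference).
total = (V(phi0 + h) - V(phi0 - h)) / (2 * h)
# Direct effect only: x* and y* are held at their optimal values.
x0, y0 = optimum(phi0)
direct = f_phi(x0, y0, phi0)
print(total, direct)
```

For this example V(φ) = 1.25φ^2, so both numbers should agree with 2.5φ up to the finite-difference error, in line with (13.31').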


The Profit Function 

Let us now apply the notion of the maximum-value function to derive the profit function of
a competitive firm. Consider the case where a firm uses two inputs: capital K and labor L.
The profit function is

$\pi = P f(K, L) - wL - rK$    (13.32)

where P is the output price and w and r are the wage rate and rental rate, respectively.

The first-order conditions are

$\pi_L = P f_L(K, L) - w = 0$
$\pi_K = P f_K(K, L) - r = 0$    (13.33)

which respectively define the input-demand equations

$L^* = L^*(w, r, P)$
$K^* = K^*(w, r, P)$    (13.34)

Substituting the solutions K* and L* into the objective function gives us

$\pi^*(w, r, P) = P f(K^*, L^*) - wL^* - rK^*$    (13.35)

where $\pi^*(w, r, P)$ is the profit function (an indirect objective function). The profit function
gives the maximum profit as a function of the exogenous variables w, r, and P.

Now consider the effect of a change in w on the firm's profits. If we differentiate
the original profit function (13.32) with respect to w, holding all other variables constant,
we get

$\frac{\partial \pi}{\partial w} = -L$    (13.36)

However, this result does not take into account the profit-maximizing firm's ability to make
a substitution of capital for labor and adjust the level of output in accordance with
profit-maximizing behavior.

In contrast, since $\pi^*(w, r, P)$ is the maximum value of profits for any values of w, r, and
P, changes in $\pi^*$ from a change in w take all capital-for-labor substitutions into account.
To evaluate a change in the maximum profit function caused by a change in w, we
differentiate $\pi^*(w, r, P)$ with respect to w to obtain

$\frac{\partial \pi^*}{\partial w} = (P f_L - w) \frac{\partial L^*}{\partial w} + (P f_K - r) \frac{\partial K^*}{\partial w} - L^*$    (13.37)


From the first-order conditions (13.33), the two terms in parentheses are equal to zero.
Therefore, the equation becomes

$\frac{\partial \pi^*}{\partial w} = -L^*(w, r, P)$    (13.38)


This result says that, at the profit-maximizing position, a change in profits with respect to a
change in the wage rate is the same whether or not the factors are held constant or allowed
to vary as the factor price changes. In this case, (13.38) shows that the derivative of the
profit function with respect to w is the negative of the factor demand function $L^*(w, r, P)$.
Following the preceding procedure, we can also show the additional comparative-static
results:

$\frac{\partial \pi^*(w, r, P)}{\partial r} = -K^*(w, r, P)$    (13.39)

and

$\frac{\partial \pi^*(w, r, P)}{\partial P} = f(K^*, L^*)$    (13.40)


Equations (13.38), (13.39), and (13.40) are collectively known as Hotelling's lemma. We
have obtained these comparative-static derivatives from the profit function by allowing K*
and L* to adjust to any parameter change. But it is easy to see that the same results will
emerge if we differentiate the profit function (13.35) with respect to each parameter while
holding K* and L* constant. Thus Hotelling’s lemma is simply another manifestation of the
envelope theorem that we encountered earlier in (13.31').
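
Hotelling's lemma can be checked numerically for a specific technology. The sketch below assumes the production function f(K, L) = K^(1/4) L^(1/4), which is our own illustrative choice and not taken from the text. For it, the first-order conditions (13.33) solve in closed form to L* = (P^2/16) w^(-3/2) r^(-1/2) and K* = (P^2/16) w^(-1/2) r^(-3/2). The parameter values and the helper `ddx` are likewise hypothetical.

```python
def demands(w, r, P):
    """Input demands for the assumed technology f(K, L) = K**0.25 * L**0.25."""
    L = (P ** 2 / 16) * w ** -1.5 * r ** -0.5
    K = (P ** 2 / 16) * w ** -0.5 * r ** -1.5
    return L, K


def profit_star(w, r, P):
    """Indirect profit function: profit at the optimal inputs, as in (13.35)."""
    L, K = demands(w, r, P)
    return P * (K * L) ** 0.25 - w * L - r * K


def ddx(g, x, h=1e-6):
    """Central-difference approximation to the derivative of g at x."""
    return (g(x + h) - g(x - h)) / (2 * h)


w0, r0, P0 = 2.0, 1.0, 8.0
L0, K0 = demands(w0, r0, P0)
Q0 = (K0 * L0) ** 0.25   # optimal output f(K*, L*)

dpi_dw = ddx(lambda w: profit_star(w, r0, P0), w0)
dpi_dr = ddx(lambda r: profit_star(w0, r, P0), r0)
dpi_dP = ddx(lambda P: profit_star(w0, r0, P), P0)

print(dpi_dw, -L0)   # compare with (13.38)
print(dpi_dr, -K0)   # compare with (13.39)
print(dpi_dP, Q0)    # compare with (13.40)
```

Each pair of printed numbers should agree up to the finite-difference error, even though the derivatives on the left let K* and L* readjust while the expressions on the right do not involve their adjustment at all.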


Reciprocity Condition 

Consider again our two-variable unconstrained maximization problem

Maximize $U = f(x, y, \phi)$    [from (13.27)]

where x and y are the choice variables and $\phi$ is a parameter. The first-order conditions are
$f_x = f_y = 0$, which imply $x^* = x^*(\phi)$ and $y^* = y^*(\phi)$.

We are interested in the comparative statics regarding the directions of change in $x^*(\phi)$ and
$y^*(\phi)$ as $\phi$ changes and the effects on the value function. The maximum-value function is

$V(\phi) = f(x^*(\phi), y^*(\phi), \phi)$    (13.41)

By definition, $V(\phi)$ gives the maximum value of f for any given $\phi$.

Now consider a new function that depicts the difference between the actual value and the
maximum value of U:

$\Omega(x, y, \phi) = f(x, y, \phi) - V(\phi)$    (13.42)

This new function $\Omega$ has a maximum value of zero when $x = x^*$ and $y = y^*$; for any
$x \ne x^*$, $y \ne y^*$ we have $f \le V$. In this framework $\Omega(x, y, \phi)$ can be considered a function
of three independent variables, x, y, and $\phi$. The maximum of $\Omega(x, y, \phi) = f(x, y, \phi) - V(\phi)$
can be determined by the first- and second-order conditions.

The first-order conditions are

$\Omega_x(x, y, \phi) = f_x = 0$
$\Omega_y(x, y, \phi) = f_y = 0$    (13.43)

and

$\Omega_\phi(x, y, \phi) = f_\phi - V_\phi = 0$    (13.44)

We can see that the first-order conditions of our new function $\Omega$ in (13.43) are nothing but
the original maximum conditions for $f(x, y, \phi)$ in (13.28), whereas the condition in
(13.44) really restates the envelope theorem (13.31'). These first-order conditions hold
whenever $x = x^*(\phi)$ and $y = y^*(\phi)$. The second-order sufficient conditions are satisfied if
the Hessian of $\Omega$

$H = \begin{vmatrix} f_{xx} & f_{xy} & f_{x\phi} \\ f_{yx} & f_{yy} & f_{y\phi} \\ f_{\phi x} & f_{\phi y} & f_{\phi\phi} - V_{\phi\phi} \end{vmatrix}$

is characterized by

$f_{xx} < 0$    $f_{xx} f_{yy} - f_{xy}^2 > 0$    $H < 0$

In deriving the Hessian above, we listed the variables in the order $(x, y, \phi)$ and,
consequently, the first entry in the second-order conditions, $(\Omega_{xx} =) f_{xx} < 0$, relates to the
variable x. Had we adopted an alternative listing order, then the first entry could have been
$\Omega_{yy} = f_{yy} < 0$, or

$\Omega_{\phi\phi} = f_{\phi\phi} - V_{\phi\phi} < 0$    (13.45)

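It turns out that (13.45) can lead us to a result that provides a quick way to reach a
comparative-static conclusion. First, we know from (13.44) that

$V_\phi(\phi) = f_\phi(x^*(\phi), y^*(\phi), \phi)$

Differentiating both sides with respect to $\phi$ yields

$V_{\phi\phi} = f_{\phi x} \frac{\partial x^*}{\partial \phi} + f_{\phi y} \frac{\partial y^*}{\partial \phi} + f_{\phi\phi}$    (13.46)

Using (13.45) and Young’s theorem, we can write

$V_{\phi\phi} - f_{\phi\phi} = f_{x\phi} \frac{\partial x^*}{\partial \phi} + f_{y\phi} \frac{\partial y^*}{\partial \phi} > 0$    (13.47)

Suppose that $\phi$ enters only in the first-order condition for x, such that $f_{y\phi} = 0$. Then
(13.47) reduces to

$f_{x\phi} \frac{\partial x^*}{\partial \phi} > 0$    (13.48)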

which implies that f x $ and 3 a 7# will have the same sign. Thus, whenever we see the 
parameter 0 appearing only in the first-order condition relating to x t and once we have 
determined the sign of the derivative f xr p from the objective function U = f{x, y, 0). 
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wc can immediately tell the sign of the comparative-static derivative dx‘ji)0 without 
further ado. 
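
As a quick check on (13.47) and (13.48), consider a simple objective function chosen here purely
for illustration (it is an assumption of this sketch, not a model from the text), in which $\phi$
enters only the first-order condition for $x$:

```latex
\begin{aligned}
f(x, y, \phi) &= -x^2 - y^2 + \phi x \\
f_x &= -2x + \phi = 0, \qquad f_y = -2y = 0
  \quad\Longrightarrow\quad x^*(\phi) = \tfrac{\phi}{2}, \quad y^*(\phi) = 0 \\
f_{xx} &= -2 < 0, \qquad f_{xx} f_{yy} - f_{xy}^2 = 4 > 0 \\
V(\phi) &= f(x^*, y^*, \phi) = \tfrac{\phi^2}{4}, \qquad
  V_{\phi\phi} = \tfrac{1}{2}, \qquad f_{\phi\phi} = 0 \\
V_{\phi\phi} - f_{\phi\phi} &= \tfrac{1}{2}
  = f_{x\phi}\,\frac{\partial x^*}{\partial \phi} + f_{y\phi}\,\frac{\partial y^*}{\partial \phi}
  = (1)\left(\tfrac{1}{2}\right) + (0)(0) > 0
\end{aligned}
```

Here $f_{x\phi} = 1$ is positive and so is $\partial x^*/\partial \phi = 1/2$, which is the sign agreement
that (13.48) asserts.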

For example, in the profit-maximization model:

$$\pi = P f(K, L) - wL - rK$$

where the first-order conditions are

$$\pi_L = P f_L - w = 0$$
$$\pi_K = P f_K - r = 0$$

the exogenous variable $w$ enters only the first-order condition $P f_L - w = 0$, with

$$\frac{\partial \pi_L}{\partial w} = -1$$

Therefore, by (13.48), we can conclude that $\partial L^*/\partial w$ will also be negative.

Further, if we combine the envelope theorem with Young's theorem, we can derive a
relation known as the reciprocity condition: $\partial L^*/\partial r = \partial K^*/\partial w$. From the indirect profit
function $\pi^*(w, r, P)$, Hotelling's lemma gives us

$$\pi^*_w = \frac{\partial \pi^*}{\partial w} = -L^*(w, r, P)$$
$$\pi^*_r = \frac{\partial \pi^*}{\partial r} = -K^*(w, r, P)$$

Differentiating again and applying Young's theorem, we have

$$\pi^*_{wr} = -\frac{\partial L^*}{\partial r} = -\frac{\partial K^*}{\partial w} = \pi^*_{rw}$$

or

$$\frac{\partial L^*}{\partial r} = \frac{\partial K^*}{\partial w} \tag{13.49}$$

This result is referred to as the reciprocity condition because it shows the symmetry
between the comparative-static cross effect produced by the price of one input on the
demand for the "other" input. Specifically, in the comparative-static sense, the effect of $r$
(the rental rate for capital $K$) on the optimal demand for labor $L$ is the same as the effect of
$w$ (the wage rate for labor $L$) on the optimal demand for capital $K$.
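
The reciprocity condition can be checked numerically. The sketch below assumes a specific
production function, $f(K, L) = K^{1/4} L^{1/4}$, which is not taken from the text; the
closed-form input demands in `input_demands` follow from solving the two first-order conditions
for that assumed function, and the function names are hypothetical. The derivatives are
approximated by central differences.

```python
# Illustrative check of the reciprocity condition dL*/dr = dK*/dw.
# Assumption (not from the text): f(K, L) = K**0.25 * L**0.25.

def input_demands(w, r, P):
    """Profit-maximizing (K*, L*) for the assumed production function."""
    K_star = P**2 / (16 * r**1.5 * w**0.5)
    L_star = P**2 / (16 * r**0.5 * w**1.5)
    return K_star, L_star

def foc_residuals(w, r, P):
    """Values of P*f_L - w and P*f_K - r at (K*, L*); both should be ~0."""
    K, L = input_demands(w, r, P)
    f_L = 0.25 * K**0.25 * L**-0.75
    f_K = 0.25 * K**-0.75 * L**0.25
    return P * f_L - w, P * f_K - r

def central_diff(fn, x, h=1e-6):
    """Central-difference approximation to the derivative of fn at x."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

w, r, P = 2.0, 3.0, 10.0
dL_dr = central_diff(lambda r_: input_demands(w, r_, P)[1], r)  # cross effect of r on L*
dK_dw = central_diff(lambda w_: input_demands(w_, r, P)[0], w)  # cross effect of w on K*
dL_dw = central_diff(lambda w_: input_demands(w_, r, P)[1], w)  # own effect of w on L*

print("FOC residuals:", foc_residuals(w, r, P))
print("dL*/dr =", dL_dr, " dK*/dw =", dK_dw)
print("dL*/dw =", dL_dw)
```

The two cross effects agree up to numerical error, and the own-price effect on labor is
negative, in line with the conclusion drawn from (13.48).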


The Envelope Theorem for Constrained Optimization 

The envelope theorem can also be derived for the case of constrained optimization. Again
we will have an objective function ($U$), two choice variables ($x$ and $y$), and one parameter
($\phi$), except now we introduce the following constraint:

$$g(x, y; \phi) = 0$$

The problem becomes:

$$\begin{aligned} \text{Maximize} \quad & U = f(x, y; \phi) \\ \text{subject to} \quad & g(x, y; \phi) = 0 \end{aligned} \tag{13.50}$$

The Lagrangian for this problem is

$$Z = f(x, y; \phi) + \lambda[0 - g(x, y; \phi)] \tag{13.51}$$

with first-order conditions

$$Z_x = f_x - \lambda g_x = 0$$
$$Z_y = f_y - \lambda g_y = 0$$
$$Z_\lambda = -g(x, y; \phi) = 0$$

Solving this system of equations gives us

$$x = x^*(\phi) \qquad y = y^*(\phi) \qquad \lambda = \lambda^*(\phi)$$

Substituting the solutions into the objective function, we get

$$U^* = f(x^*(\phi), y^*(\phi), \phi) = V(\phi) \tag{13.52}$$

where $V(\phi)$ is the indirect objective function, a maximum-value function. This is the
maximum value of $f$, for any given $\phi$, over the values of $x$ and $y$ that satisfy the constraint.

How does $V(\phi)$ change as $\phi$ changes? First, we differentiate $V$ with respect to $\phi$:

$$\frac{dV}{d\phi} = f_x\frac{\partial x^*}{\partial \phi} + f_y\frac{\partial y^*}{\partial \phi} + f_\phi \tag{13.53}$$

In this case, however, (13.53) will not simplify to $dV/d\phi = f_\phi$ since in constrained
optimization, it is not necessary to have $f_x = f_y = 0$ (see Table 12.1). But if we substitute the
solutions to $x$ and $y$ into the constraint (producing an identity), we get

$$g(x^*(\phi), y^*(\phi), \phi) \equiv 0$$

and differentiating this with respect to $\phi$ yields

$$g_x\frac{\partial x^*}{\partial \phi} + g_y\frac{\partial y^*}{\partial \phi} + g_\phi \equiv 0 \tag{13.54}$$

If we multiply (13.54) by $\lambda$, combine the result with (13.53), and rearrange terms, we get

$$\frac{dV}{d\phi} = (f_x - \lambda g_x)\frac{\partial x^*}{\partial \phi} + (f_y - \lambda g_y)\frac{\partial y^*}{\partial \phi} + f_\phi - \lambda g_\phi \tag{13.55}$$

where $f_\phi - \lambda g_\phi = Z_\phi$ is the partial derivative of the Lagrangian function with respect to
$\phi$, holding all other variables constant. This result is in the same spirit as (13.31), and by
virtue of the first-order conditions, it reduces to

$$\frac{dV}{d\phi} = Z_\phi \tag{13.55'}$$

which represents the envelope theorem in the framework of constrained optimization. Note,
however, in the present case, the Lagrangian function replaces the objective function in
deriving the indirect objective function.
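
A small worked case may help fix ideas. The objective and constraint below are assumptions made
only for this illustration (they do not come from the text); the parameter $\phi > 0$ enters the
constraint but not the objective function:

```latex
\begin{aligned}
&\text{Maximize } f = xy \quad \text{subject to } g(x, y; \phi) = x + \phi y - 10 = 0 \\
Z &= xy + \lambda\,[\,0 - (x + \phi y - 10)\,] \\
Z_x &= y - \lambda = 0, \qquad Z_y = x - \lambda\phi = 0, \qquad Z_\lambda = -(x + \phi y - 10) = 0 \\
x^* &= 5, \qquad y^* = \frac{5}{\phi}, \qquad \lambda^* = \frac{5}{\phi} \\
V(\phi) &= x^* y^* = \frac{25}{\phi}
  \quad\Longrightarrow\quad \frac{dV}{d\phi} = -\frac{25}{\phi^2} \\
Z_\phi &= f_\phi - \lambda^* g_\phi = 0 - \frac{5}{\phi}\cdot\frac{5}{\phi} = -\frac{25}{\phi^2}
\end{aligned}
```

The two derivatives coincide, as (13.55') requires, even though $f_\phi = 0$. This is why the
Lagrangian, and not the objective function alone, must be differentiated when the parameter
appears in the constraint.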

While the results in (13.55) nicely parallel the unconstrained case, it is important to note
that some of the comparative-static results depend critically on whether the parameters
enter only the objective function, or only the constraints, or enter both. If a parameter
enters only in the objective function, then the comparative-static results are the same as for
the unconstrained case. However, if the parameter enters the constraint, the relation

$$V_{\phi\phi} \ge f_{\phi\phi}$$

will no longer hold.
Interpretation of the Lagrange Multiplier

In the consumer choice problem in Chap. 12 we derived the result that the Lagrange
multiplier $\lambda$ represented the change in the value of the Lagrange function when the consumer's
budget changed. We interpreted $\lambda$ as the marginal utility of income. Now let us derive a
more general interpretation of the Lagrange multiplier with the assistance of the envelope
theorem. Consider the problem

$$\begin{aligned} \text{Maximize} \quad & U = f(x, y) \\ \text{subject to} \quad & g(x, y) = c \end{aligned}$$

where $c$ is a constant. The Lagrangian for this problem is

$$Z = f(x, y) + \lambda[c - g(x, y)] \tag{13.56}$$

The first-order conditions are

$$Z_x = f_x(x, y) - \lambda g_x(x, y) = 0$$
$$Z_y = f_y(x, y) - \lambda g_y(x, y) = 0 \tag{13.57}$$
$$Z_\lambda = c - g(x, y) = 0$$

From the first two equations in (13.57), we get

$$\lambda = \frac{f_x}{g_x} = \frac{f_y}{g_y} \tag{13.58}$$

which gives us the condition that the slope of the level curve (indifference curve) of the
objective function must equal the slope of the constraint at the optimum.

Equations (13.57) implicitly define the solutions

$$x^* = x^*(c) \qquad y^* = y^*(c) \qquad \lambda^* = \lambda^*(c) \tag{13.59}$$


Substituting (13.59) back into the Lagrangian yields the maximum-value function,

$$V(c) = Z^*(c) = f(x^*(c), y^*(c)) + \lambda^*(c)[c - g(x^*(c), y^*(c))] \tag{13.60}$$

Differentiating with respect to $c$ yields

$$\frac{dV}{dc} = \frac{dZ^*}{dc} = f_x\frac{\partial x^*}{\partial c} + f_y\frac{\partial y^*}{\partial c} + [c - g(x^*, y^*)]\frac{\partial \lambda^*}{\partial c} - \lambda^* g_x\frac{\partial x^*}{\partial c} - \lambda^* g_y\frac{\partial y^*}{\partial c} + \lambda^*\frac{dc}{dc}$$

By rearranging we get

$$\frac{dZ^*}{dc} = [f_x - \lambda^* g_x]\frac{\partial x^*}{\partial c} + [f_y - \lambda^* g_y]\frac{\partial y^*}{\partial c} + [c - g(x^*, y^*)]\frac{\partial \lambda^*}{\partial c} + \lambda^*$$

By (13.57), the three terms in brackets are all equal to zero. Therefore this expression
simplifies to

$$\frac{dV}{dc} = \frac{dZ^*}{dc} = \lambda^* \tag{13.61}$$

which shows that the optimal value $\lambda^*$ measures the rate of change of the maximum
value of the objective function when $c$ changes, and is for this reason referred to as the
"shadow price" of $c$. Note that, in this case, $c$ enters the problem only through the
constraint; it is not an argument of the original objective function.
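
The shadow-price interpretation in (13.61) can be illustrated numerically. The sketch below
assumes a particular problem, maximize $xy$ subject to $x + y = c$, which is a hypothetical
example and not one from the text; `solve` simply encodes the solution of its first-order
conditions, and the derivative of the value function is approximated by a central difference.

```python
# Illustrative check that dV/dc equals the multiplier lambda*.
# Assumed problem (not from the text): maximize x*y subject to x + y = c.

def solve(c):
    """Solution of the first-order conditions y - lam = 0, x - lam = 0, c - x - y = 0."""
    x = c / 2.0
    y = c / 2.0
    lam = y  # from Z_x = f_x - lam * g_x = y - lam = 0
    return x, y, lam

def V(c):
    """Maximum-value function: the objective evaluated at the optimum."""
    x, y, _ = solve(c)
    return x * y

c = 6.0
h = 1e-6
dV_dc = (V(c + h) - V(c - h)) / (2 * h)  # numerical derivative of V at c
lam_star = solve(c)[2]

print("dV/dc   =", dV_dc)
print("lambda* =", lam_star)
```

With $c = 6$, both numbers are approximately 3: relaxing the constraint by a small amount raises
the attainable maximum at the rate $\lambda^*$.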

13.6 Duality and the Envelope Theorem

A consumer's expenditure function and his or her indirect utility function exemplify the
minimum- and maximum-value functions for dual problems.¹ An expenditure function
specifies the minimum expenditure required to obtain a fixed level of utility given the
utility function and the prices of consumption goods. An indirect utility function specifies the
maximum utility that can be obtained given prices, income, and the utility function.

The Primal Problem 

Let $U(x, y)$ be a utility function where $x$ and $y$ are consumption goods. The consumer has
a budget $B$ and faces market prices $P_x$ and $P_y$ for goods $x$ and $y$, respectively. This problem
will be considered the primal problem:

$$\begin{aligned} \text{Maximize} \quad & U = U(x, y) \\ \text{subject to} \quad & P_x x + P_y y = B \end{aligned} \qquad \text{[Primal]} \tag{13.62}$$

For this problem, we have the familiar Lagrangian

$$Z = U(x, y) + \lambda(B - P_x x - P_y y)$$

The first-order conditions are

$$Z_x = U_x - \lambda P_x = 0$$
$$Z_y = U_y - \lambda P_y = 0 \tag{13.63}$$
$$Z_\lambda = B - P_x x - P_y y = 0$$

This system of equations implicitly defines a solution for $x^m$, $y^m$, and $\lambda^m$ as a function of
the exogenous variables $B$, $P_x$, $P_y$:

$$x^m = x^m(P_x, P_y, B)$$
$$y^m = y^m(P_x, P_y, B)$$
$$\lambda^m = \lambda^m(P_x, P_y, B)$$

The solutions $x^m$ and $y^m$ are the consumer's ordinary demand functions, sometimes called
the "Marshallian" demand functions, hence the superscript $m$.

Substituting the solutions $x^m$ and $y^m$ into the utility function yields

$$U^* = U(x^m(P_x, P_y, B), y^m(P_x, P_y, B)) \equiv V(P_x, P_y, B) \tag{13.64}$$

where $V$ is the indirect utility function, a maximum-value function showing the maximum
attainable utility in problem (13.62). We shall return to this function later.


1 Duality in economic theory is the relationship between two constrained optimization problems. If 
one of the problems requires constrained maximization, the other problem will require constrained 
minimization. The structure and solution of either problem can provide information about the 
structure and solution of the other problem. 
The Dual Problem

Now consider a related dual problem for the consumer with the objective of minimizing the
expenditure on $x$ and $y$ while maintaining a fixed utility level $U^*$ derived from (13.64) of
the primal problem:

$$\begin{aligned} \text{Minimize} \quad & E = P_x x + P_y y \\ \text{subject to} \quad & U(x, y) = U^* \end{aligned} \qquad \text{[Dual]} \tag{13.65}$$

Its Lagrangian is

$$Z^d = P_x x + P_y y + \mu[U^* - U(x, y)]$$

and the first-order conditions are

$$Z^d_x = P_x - \mu U_x = 0$$
$$Z^d_y = P_y - \mu U_y = 0 \tag{13.66}$$
$$Z^d_\mu = U^* - U(x, y) = 0$$

This system of equations implicitly defines a set of solution values to be labeled $x^h$, $y^h$,
and $\mu^h$:

$$x^h = x^h(P_x, P_y, U^*)$$
$$y^h = y^h(P_x, P_y, U^*)$$
$$\mu^h = \mu^h(P_x, P_y, U^*)$$

Here $x^h$ and $y^h$ are the compensated ("real income" held constant) demand functions. They
are commonly referred to as "Hicksian" demand functions, hence the $h$ superscript.
Substituting $x^h$ and $y^h$ into the objective function of the dual problem yields

$$P_x x^h(P_x, P_y, U^*) + P_y y^h(P_x, P_y, U^*) \equiv E(P_x, P_y, U^*) \tag{13.67}$$

where $E$ is the expenditure function, a minimum-value function showing the minimum
expenditure needed to attain the utility level $U^*$.


Duality 

If we take the first two equations in (13.63) and in (13.66), and eliminate the Lagrange
multipliers, we can write

$$\frac{P_x}{P_y} = \frac{U_x}{U_y} \tag{13.68}$$

This is the tangency condition in which the consumer chooses the optimal bundle where the
slope of the indifference curve equals the slope of the budget constraint. The tangency
condition is identical for both problems. Thus, when the target level of utility in the
minimization problem is set equal to the value $U^*$ obtained from the maximization problem, we get

$$x^m(P_x, P_y, B) = x^h(P_x, P_y, U^*)$$
$$y^m(P_x, P_y, B) = y^h(P_x, P_y, U^*) \tag{13.69}$$

i.e., the solutions to both the maximization problem and the minimization problem produce
identical values for $x$ and $y$. However, the solutions are functions of different exogenous
variables, so comparative-static exercises will generally produce different results.

The fact that the solution values for $x$ and $y$ in the primal and dual problems are
determined by the tangency point of the same indifference curve and budget-constraint line
means that the minimized expenditure in the dual problem is equal to the given budget $B$ of
the primal problem:

$$E(P_x, P_y, U^*) = B \tag{13.70}$$

This result is parallel to the result in (13.64), which reveals that the maximized value of
utility $V$ in the primal problem is equal to the given target level of utility $U^*$ in the dual problem.

While the solution values of $x$ and $y$ are identical in the two problems, the same cannot
be said about the Lagrange multipliers. From the first equation in (13.63) and in (13.66), we
can calculate $\lambda = U_x/P_x$, but $\mu = P_x/U_x$. Thus, the solution values of $\lambda$ and $\mu$ are
reciprocal to each other:

$$\lambda = \frac{1}{\mu} \qquad \text{or} \qquad \lambda^m = \frac{1}{\mu^h} \tag{13.71}$$


Roy's Identity 

One application of the envelope theorem is the derivation of Roy's identity. Roy's identity
states that the individual consumer's Marshallian demand function is equal to the negative of
the ratio of two partial derivatives of the maximum-value function.

Substituting the optimal values $x^m$, $y^m$, and $\lambda^m$ into the Lagrangian of (13.62) gives us

$$V(P_x, P_y, B) = U(x^m, y^m) + \lambda^m(B - P_x x^m - P_y y^m) \tag{13.72}$$

When we differentiate (13.72) with respect to $P_x$ we find

$$\frac{\partial V}{\partial P_x} = (U_x - \lambda^m P_x)\frac{\partial x^m}{\partial P_x} + (U_y - \lambda^m P_y)\frac{\partial y^m}{\partial P_x} + (B - P_x x^m - P_y y^m)\frac{\partial \lambda^m}{\partial P_x} - \lambda^m x^m$$

At the optimum, the first-order conditions (13.63) enable us to simplify this to

$$\frac{\partial V}{\partial P_x} = -\lambda^m x^m$$

Next, differentiate the value function with respect to $B$ to get

$$\frac{\partial V}{\partial B} = (U_x - \lambda^m P_x)\frac{\partial x^m}{\partial B} + (U_y - \lambda^m P_y)\frac{\partial y^m}{\partial B} + (B - P_x x^m - P_y y^m)\frac{\partial \lambda^m}{\partial B} + \lambda^m$$

Again, at the optimum, (13.63) enables us to simplify this to

$$\frac{\partial V}{\partial B} = \lambda^m$$

By taking the ratio of these two partial derivatives, we find that

$$\frac{\partial V/\partial P_x}{\partial V/\partial B} = -x^m \qquad \text{or} \qquad x^m = -\frac{\partial V/\partial P_x}{\partial V/\partial B} \tag{13.73}$$

This result, known as Roy's identity, shows that the Marshallian demand for commodity $x$
is the negative of the ratio of two partial derivatives of the maximum-value function $V$ with
respect to $P_x$ and $B$, respectively. In view of the symmetry between $x$ and $y$ in the problem,
a result similar to (13.73) can also be written for $y^m$, the Marshallian demand for $y$. Of
course, this result could be arrived at directly by applying the envelope theorem.
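
The direct route can be sketched as follows. It relies only on the constrained envelope theorem
(13.55'), applied once with $P_x$ and once with $B$ playing the role of the parameter:

```latex
\begin{aligned}
Z &= U(x, y) + \lambda\,(B - P_x x - P_y y) \\
\frac{\partial V}{\partial P_x}
  &= \left.\frac{\partial Z}{\partial P_x}\right|_{(x^m,\; y^m,\; \lambda^m)} = -\lambda^m x^m \\
\frac{\partial V}{\partial B}
  &= \left.\frac{\partial Z}{\partial B}\right|_{(x^m,\; y^m,\; \lambda^m)} = \lambda^m \\
-\frac{\partial V / \partial P_x}{\partial V / \partial B}
  &= \frac{\lambda^m x^m}{\lambda^m} = x^m
\end{aligned}
```

The last step presumes $\lambda^m \neq 0$, that is, a nonzero marginal utility of income at the optimum.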


Shephard's Lemma 

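In Sec. 13.5, we derived Hotelling's lemma, which states that the partial derivatives of
the maximum value of the profit function yield the firm's input-demand functions and the
supply functions. A similar approach applied to the expenditure function yields Shephard's
lemma.

Consider the consumer's minimization problem (13.65). The Lagrangian is

$$Z^d = P_x x + P_y y + \mu[U^* - U(x, y)]$$

From the first-order conditions, the following solutions are implicitly defined:

$$x^h = x^h(P_x, P_y, U^*)$$
$$y^h = y^h(P_x, P_y, U^*)$$
$$\mu^h = \mu^h(P_x, P_y, U^*)$$

Substituting these solutions into the Lagrangian yields the expenditure function:

$$E(P_x, P_y, U^*) = P_x x^h + P_y y^h + \mu^h[U^* - U(x^h, y^h)]$$

Taking the partial derivatives of this function with respect to $P_x$ and $P_y$ and evaluating them
at the optimum, we find that $\partial E/\partial P_x$ and $\partial E/\partial P_y$ represent the consumer's Hicksian
demands:

$$\begin{aligned} \frac{\partial E}{\partial P_x} &= (P_x - \mu^h U_x)\frac{\partial x^h}{\partial P_x} + (P_y - \mu^h U_y)\frac{\partial y^h}{\partial P_x} + [U^* - U(x^h, y^h)]\frac{\partial \mu^h}{\partial P_x} + x^h \\ &= (0)\frac{\partial x^h}{\partial P_x} + (0)\frac{\partial y^h}{\partial P_x} + (0)\frac{\partial \mu^h}{\partial P_x} + x^h = x^h \end{aligned} \tag{13.74}$$

and

$$\begin{aligned} \frac{\partial E}{\partial P_y} &= (P_x - \mu^h U_x)\frac{\partial x^h}{\partial P_y} + (P_y - \mu^h U_y)\frac{\partial y^h}{\partial P_y} + [U^* - U(x^h, y^h)]\frac{\partial \mu^h}{\partial P_y} + y^h \\ &= (0)\frac{\partial x^h}{\partial P_y} + (0)\frac{\partial y^h}{\partial P_y} + (0)\frac{\partial \mu^h}{\partial P_y} + y^h = y^h \end{aligned} \tag{13.74'}$$

Finally, differentiating $E$ with respect to the constraint $U^*$ yields $\mu^h$, the marginal cost of
the constraint:

$$\begin{aligned} \frac{\partial E}{\partial U^*} &= (P_x - \mu^h U_x)\frac{\partial x^h}{\partial U^*} + (P_y - \mu^h U_y)\frac{\partial y^h}{\partial U^*} + [U^* - U(x^h, y^h)]\frac{\partial \mu^h}{\partial U^*} + \mu^h \\ &= (0)\frac{\partial x^h}{\partial U^*} + (0)\frac{\partial y^h}{\partial U^*} + (0)\frac{\partial \mu^h}{\partial U^*} + \mu^h = \mu^h \end{aligned} \tag{13.74''}$$

Together, the three partial derivatives (13.74), (13.74'), and (13.74'') are referred to as
Shephard's lemma.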

Example 1

Consider a consumer with the utility function $U = xy$, who faces a budget constraint of $B$
and is given prices $P_x$ and $P_y$.

The choice problem is

$$\begin{aligned} \text{Maximize} \quad & U = xy \\ \text{subject to} \quad & P_x x + P_y y = B \end{aligned}$$

The Lagrangian for this problem is

$$Z = xy + \lambda(B - P_x x - P_y y)$$

The first-order conditions are

$$Z_x = y - \lambda P_x = 0$$
$$Z_y = x - \lambda P_y = 0$$
$$Z_\lambda = B - P_x x - P_y y = 0$$

Solving the first-order conditions yields the following solutions:

$$x^m = \frac{B}{2P_x} \qquad y^m = \frac{B}{2P_y} \qquad \lambda^m = \frac{B}{2P_x P_y}$$

where $x^m$ and $y^m$ are the consumer's Marshallian demand functions. For the second-order
condition, since the bordered Hessian is

$$|H| = \begin{vmatrix} 0 & 1 & -P_x \\ 1 & 0 & -P_y \\ -P_x & -P_y & 0 \end{vmatrix} = 2P_x P_y > 0$$

the solution does represent a maximum.*

We can now derive the indirect utility function for this problem by substituting $x^m$ and
$y^m$ into the utility function:

$$V(P_x, P_y, B) = \left(\frac{B}{2P_x}\right)\left(\frac{B}{2P_y}\right) = \frac{B^2}{4P_x P_y} \tag{13.75}$$

where $V$ denotes the maximized utility. Since $V$ represents the maximized utility, we can set
$V = U^*$ in (13.75) to get $B^2/4P_x P_y = U^*$, and then rearrange terms to express $B$ as

$$B = (4P_x P_y U^*)^{1/2} = 2P_x^{1/2} P_y^{1/2} U^{*1/2}$$

Now, think of the consumer's dual problem of expenditure minimization. In the dual
problem, the minimum-expenditure function $E$ should be equal to the given budget
amount $B$ of the primal problem. Therefore, we can immediately conclude from the
preceding equation that

$$E(P_x, P_y, U^*) = B = 2P_x^{1/2} P_y^{1/2} U^{*1/2} \tag{13.76}$$


* Note that the bordered Hessian is written here (and in Example 2 on page 440) with the borders in 
the third row and column, instead of in the first row and column as in (12.19). This is the result of 
listing the Lagrange multiplier as the last rather than the first variable as we did in previous chapters. 
Exercise 12.3-3 shows that the two alternative expressions for the bordered Hessian are transformable 
into each other by elementary row operations without affecting its value. However, when more than 
two choice variables appear in a problem, it is preferable to use the (12.19) format because that 
makes it easier to write out the bordered leading principal minors. 
Let's now use this example to verify Roy's identity (13.73)

$$x^m = -\frac{\partial V/\partial P_x}{\partial V/\partial B}$$

Taking the relevant partial derivatives of $V$, we find

$$\frac{\partial V}{\partial P_x} = -\frac{B^2}{4P_x^2 P_y} \qquad \text{and} \qquad \frac{\partial V}{\partial B} = \frac{B}{2P_x P_y}$$

The negative of the ratio of these two partials is

$$-\frac{\partial V/\partial P_x}{\partial V/\partial B} = \frac{B^2/(4P_x^2 P_y)}{B/(2P_x P_y)} = \frac{B}{2P_x} = x^m$$

Thus we find that Roy's identity does hold.


Example 2

Now consider the dual problem of cost minimization given a fixed level of utility related to
Example 1. Letting $U^*$ denote the target level of utility, the problem is:

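$$\begin{aligned} \text{Minimize} \quad & P_x x + P_y y \\ \text{subject to} \quad & xy = U^* \end{aligned}$$

The Lagrangian for the problem is

$$Z^d = P_x x + P_y y + \mu(U^* - xy)$$

The first-order conditions are

$$Z^d_x = P_x - \mu y = 0$$
$$Z^d_y = P_y - \mu x = 0$$
$$Z^d_\mu = U^* - xy = 0$$

Solving the system of equations for $x$, $y$, and $\mu$, we get

$$x^h = \left(\frac{P_y U^*}{P_x}\right)^{1/2} \qquad y^h = \left(\frac{P_x U^*}{P_y}\right)^{1/2} \qquad \mu^h = \left(\frac{P_x P_y}{U^*}\right)^{1/2} \tag{13.77}$$

where $x^h$ and $y^h$ are the consumer's compensated (Hicksian) demand functions. Checking
the second-order condition for a minimum, we find

$$|H| = \begin{vmatrix} 0 & -\mu & -y \\ -\mu & 0 & -x \\ -y & -x & 0 \end{vmatrix} = -2xy\mu < 0$$

Thus the sufficient condition for a minimum is satisfied.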





Substituting $x^h$ and $y^h$ into the original objective function gives us the minimum-value
function, or expenditure function

$$\begin{aligned} E = P_x x^h + P_y y^h &= P_x\left(\frac{P_y U^*}{P_x}\right)^{1/2} + P_y\left(\frac{P_x U^*}{P_y}\right)^{1/2} \\ &= (P_x P_y U^*)^{1/2} + (P_x P_y U^*)^{1/2} \\ &= 2P_x^{1/2} P_y^{1/2} U^{*1/2} \end{aligned} \tag{13.76'}$$

Note that this result is identical with (13.76) in Example 1. The only difference lies in the
process used to derive the result. Equation (13.76') is obtained directly from an
expenditure-minimization problem, whereas (13.76) is indirectly deduced, via the duality relationship,
from a utility-maximization problem.

We shall now use this example to test the validity of Shephard's lemma (13.74), (13.74'),
and (13.74''). Differentiating the expenditure function in (13.76') with respect to $P_x$, $P_y$,
and $U^*$, respectively, and relating the resulting partial derivatives to (13.77), we find

$$\frac{\partial E(P_x, P_y, U^*)}{\partial P_x} = \frac{P_y^{1/2} U^{*1/2}}{P_x^{1/2}} = x^h$$

$$\frac{\partial E(P_x, P_y, U^*)}{\partial P_y} = \frac{P_x^{1/2} U^{*1/2}}{P_y^{1/2}} = y^h$$

$$\frac{\partial E(P_x, P_y, U^*)}{\partial U^*} = \frac{P_x^{1/2} P_y^{1/2}}{U^{*1/2}} = \mu^h$$

Thus, Shephard's lemma holds in this example.
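
The results of Examples 1 and 2 can also be checked numerically. The sketch below codes the
closed-form solutions for the utility function $U = xy$ and approximates the partial derivatives
of $V$ and $E$ by central differences. The function names and the sample values of $P_x$, $P_y$,
and $B$ are arbitrary choices made for this illustration.

```python
import math

def marshallian(Px, Py, B):
    """x^m, y^m and lambda^m for U = xy (Example 1)."""
    return B / (2 * Px), B / (2 * Py), B / (2 * Px * Py)

def indirect_utility(Px, Py, B):
    """V(Px, Py, B) from (13.75)."""
    return B**2 / (4 * Px * Py)

def hicksian(Px, Py, U):
    """x^h, y^h and mu^h from (13.77)."""
    return (math.sqrt(Py * U / Px),
            math.sqrt(Px * U / Py),
            math.sqrt(Px * Py / U))

def expenditure(Px, Py, U):
    """E(Px, Py, U*) from (13.76)."""
    return 2 * math.sqrt(Px * Py * U)

def partial(fn, args, i, h=1e-6):
    """Central-difference partial derivative of fn with respect to argument i."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (fn(*up) - fn(*dn)) / (2 * h)

Px, Py, B = 2.0, 5.0, 40.0                      # arbitrary sample values
xm, ym, lam = marshallian(Px, Py, B)
U_star = indirect_utility(Px, Py, B)            # target utility for the dual problem
xh, yh, mu = hicksian(Px, Py, U_star)

# Roy's identity (13.73): x^m = -(dV/dPx) / (dV/dB)
roy_x = -partial(indirect_utility, (Px, Py, B), 0) / partial(indirect_utility, (Px, Py, B), 2)

# Shephard's lemma (13.74), (13.74'), (13.74'')
shep_x = partial(expenditure, (Px, Py, U_star), 0)
shep_y = partial(expenditure, (Px, Py, U_star), 1)
shep_mu = partial(expenditure, (Px, Py, U_star), 2)

print("Marshallian vs Hicksian:", (xm, ym), (xh, yh))
print("Roy:", roy_x, "Shephard:", shep_x, shep_y, shep_mu)
print("E = B?", expenditure(Px, Py, U_star), B, " lambda * mu =", lam * mu)
```

At these sample values the Marshallian and Hicksian demands coincide, the minimized expenditure
equals the budget, and the product $\lambda^m \mu^h$ equals one, in line with (13.69) through (13.71).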


EXERCISE 13.6 

1. A consumer has the following utility function: $U(x, y) = x(y + 1)$, where $x$ and $y$ are
quantities of two consumption goods whose prices are $P_x$ and $P_y$, respectively. The
consumer also has a budget of $B$. Therefore, the consumer's Lagrangian is

$$x(y + 1) + \lambda(B - P_x x - P_y y)$$

(a) From the first-order conditions find expressions for the demand functions. What
kind of good is $y$? In particular, what happens when $P_y > B$?

(b) Verify that this is a maximum by checking the second-order conditions. By
substituting $x^*$ and $y^*$ into the utility function, find an expression for the indirect utility
function

$$U^* = U(P_x, P_y, B)$$

and derive an expression for the expenditure function

$$E = E(P_x, P_y, U^*)$$

(c) This problem could be recast as the following dual problem

$$\begin{aligned} \text{Minimize} \quad & P_x x + P_y y \\ \text{subject to} \quad & x(y + 1) = U^* \end{aligned}$$

Find the values of $x$ and $y$ that solve this minimization problem and show that the
values of $x$ and $y$ are equal to the partial derivatives of the expenditure function,
$\partial E/\partial P_x$ and $\partial E/\partial P_y$, respectively.
13.7 Some Concluding Remarks


In the present part of the book, we have covered the basic techniques of optimization. The
somewhat arduous journey has taken us (1) from the case of a single choice variable to the
more general n-variable case, (2) from the polynomial objective function to the exponential
and logarithmic, and (3) from the unconstrained to the constrained variety of extremum.

Most of this discussion consists of the "classical" methods of optimization, with
differential calculus as the mainstay, and derivatives of various orders as the primary tools. One
weakness of the calculus approach to optimization is its essentially myopic nature. While
the first- and second-order conditions in terms of derivatives or differentials can normally
locate relative or local extrema without difficulty, additional information or further
investigation is often required for identification of absolute or global extrema. Our detailed
discussion of concavity, convexity, quasiconcavity, and quasiconvexity is intended as a useful
stepping-stone from the realm of relative extrema to that of absolute ones.

A more serious limitation of the calculus approach is its inability to cope with
constraints in the inequality form. For this reason, the budget constraint in the
utility-maximization model, for instance, is stated in the form that the total expenditure be exactly
equal to (and not "less than or equal to") a specified sum. In other words, the limitation of
the calculus approach makes it necessary to deny the consumer the option of saving part of
the available funds. And, for the same reason, the classical approach does not allow us to
specify explicitly that the choice variables must be nonnegative, as is appropriate in most
economic analysis.

Fortunately, we are liberated from these limitations when we introduce the modern
optimization technique known as nonlinear programming. Here we can openly admit
inequality constraints, including nonnegativity restrictions on the choice variables, into the
problem. This obviously represents a giant step forward in the development of optimization
methodology.

Still, even in nonlinear programming, the analytical framework remains static. The
problem and its solution relate only to the optimal state at one point of time and cannot
address the question of how an optimizing agent should, under given circumstances, behave
over a period of time. The latter question pertains to the realm of dynamic optimization,
which we are unable to handle until we have learned the basics of dynamic analysis, that is,
the analysis of movements of variables over time. In fact, aside from its application to dynamic
optimization, dynamic analysis is, in itself, an important branch of economic analysis. For
this reason, we shall now turn our attention to the subject of dynamic analysis in Part 5.







Chapter 14


Economic Dynamics 
and Integral Calculus 


The term dynamics, as applied to economic analysis, has had different meanings at differ¬ 
ent times and for different economists.* In standard usage today, however, the term refers to 
the type of analysis in which the object is either to trace and study the specific time paths 
of the variables or to determine whether, given sufficient time, these variables will tend to 
converge to certain (equilibrium) values. This type of information is important because it 
fills a major gap that marred our study of statics and comparative statics. In the latter, we 
always make the arbitrary assumption that the process of economic adjustment inevitably 
leads to an equilibrium. In a dynamic analysis, the question of "attainability" is to be
squarely faced, rather than assumed away. 

One salient feature of dynamic analysis is the dating of the variables, which introduces 
the explicit consideration of time into the picture. This can be done in two ways: time can 
be considered either as a continuous variable or as a discrete variable. In the former case, 
something is happening to the variable at each point of time (such as in continuous interest 
compounding); whereas in the latter, the variable undergoes a change only once within a 
period of time (e.g., interest is added only at the end of every 6 months). One of these time 
concepts may be more appropriate than the other in certain contexts. 

We shall discuss first the continuous-time case, to which the mathematical techniques of
integral calculus and differential equations are pertinent. Later, in Chaps. 17 and 18, we
shall turn to the discrete-time case, which utilizes the methods of difference equations.

14.1 Dynamics and Integration

In a static model, generally speaking, the problem is to find the values of the endogenous
variables that satisfy some specified equilibrium condition(s). Applied to the context of
optimization models, the task becomes one of finding the values of the choice variables
that maximize (or minimize) a specific objective function with the first-order
condition serving as the equilibrium condition. In a dynamic model, by contrast, the problem
involves instead the delineation of the time path of some variable, on the basis of a known
pattern of change (say, a given instantaneous rate of change).

* Fritz Machlup, "Statics and Dynamics: Kaleidoscopic Words," Southern Economic Journal, October
1959, pp. 91-110; reprinted in Machlup, Essays on Economic Semantics, Prentice-Hall, Inc.,
Englewood Cliffs, N.J., 1963, pp. 9-42.

An example should make this clear. Suppose that population size $H$ is known to change
over time at the rate

$$\frac{dH}{dt} = t^{-1/2} \tag{14.1}$$

We then try to find what time path(s) of population $H = H(t)$ can yield the rate of change
in (14.1).

You will recognize that, if we know the function $H = H(t)$ to begin with, the derivative
$dH/dt$ can be found by differentiation. But in the problem now confronting us, the shoe is
on the other foot: we are called upon to uncover the primitive function from a given derived
function, rather than the reverse. Mathematically, we now need the exact opposite of the
method of differentiation, or of differential calculus.

The relevant method, known as integration, or integral calculus, will be studied in this
chapter. For the time being, let us be content with the observation that the function
$H(t) = 2t^{1/2}$ does indeed have a derivative of the form in (14.1), thus apparently
qualifying as a solution to our problem. The trouble is that there also exist similar functions, such
as $H(t) = 2t^{1/2} + 15$ or $H(t) = 2t^{1/2} + 99$ or, more generally,

$$H(t) = 2t^{1/2} + c \qquad (c = \text{an arbitrary constant}) \tag{14.2}$$

which all possess exactly the same derivative (14.1). No unique time path can be
determined, therefore, unless the value of the constant $c$ can somehow be made definite. To
accomplish this, additional information must be introduced into the model, usually in the
form of what is known as an initial condition or boundary condition.

If we have knowledge of the initial population $H(0)$, that is, the value of $H$ at $t = 0$ (let
us say, $H(0) = 100$), then the value of the constant $c$ can be made determinate. Setting
$t = 0$ in (14.2), we get

$$H(0) = 2(0)^{1/2} + c = c$$

But if $H(0) = 100$, then $c = 100$, and (14.2) becomes

$$H(t) = 2t^{1/2} + 100 \tag{14.2'}$$

where the constant is no longer arbitrary. More generally, for any given initial population
$H(0)$, the time path will be

$$H(t) = 2t^{1/2} + H(0) \tag{14.2''}$$

Thus the population size $H$ at any point of time will, in the present example, consist of the
sum of the initial population $H(0)$ and another term involving the time variable $t$. Such a
time path indeed charts the complete itinerary of the variable $H$ over time, and thus it truly
constitutes the solution to our dynamic model. [Equation (14.1) is also a function of $t$. Why
can't it be considered a solution as well?]

Simple as it is, this population example illustrates the quintessence of the problems of
economic dynamics. Given the pattern of behavior of a variable over time, we seek to find
a function that describes the time path of the variable. In the process, we shall encounter
one or more arbitrary constants, but if we possess sufficient additional information in the
form of initial conditions, it will be possible to definitize these arbitrary constants.
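
The role of the arbitrary constant can be seen in a short numerical sketch. The code below uses
only the time path (14.2'') and the rate of change (14.1); the helper names, the sample constants,
and the sample dates are choices made for this illustration. It confirms that every member of the
family has the same slope at each date, while only one member satisfies $H(0) = 100$.

```python
import math

def H(t, H0):
    """Time path (14.2''): H(t) = 2*t**(1/2) + H(0)."""
    return 2 * math.sqrt(t) + H0

def rate(t):
    """Given rate of change (14.1): dH/dt = t**(-1/2)."""
    return t ** -0.5

def central_diff(fn, t, h=1e-6):
    """Central-difference approximation to the derivative of fn at t."""
    return (fn(t + h) - fn(t - h)) / (2 * h)

def max_slope_error(constants=(0.0, 15.0, 99.0, 100.0), dates=(0.5, 1.0, 4.0, 9.0)):
    """Largest gap between the numerical slope of H and the rate in (14.1)."""
    worst = 0.0
    for H0 in constants:
        for t in dates:
            slope = central_diff(lambda s: H(s, H0), t)
            worst = max(worst, abs(slope - rate(t)))
    return worst

print("largest slope error over all constants:", max_slope_error())
print("paths at t = 0:", [H(0.0, c) for c in (0.0, 15.0, 99.0, 100.0)])
```

The slope error is negligible for every constant, so the derivative alone cannot distinguish
among the paths; the values at $t = 0$ differ, and the initial condition picks out $c = 100$.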
In the simpler types of problem, such as the one just cited, the solution can be found by
the method of integral calculus, which deals with the process of tracing a given derivative
function back to its primitive function. In more complicated cases, we can also resort to the
known techniques of the closely related branch of mathematics known as differential
equations. Since a differential equation is defined as any equation containing differential or
derivative expressions, (14.1) surely qualifies as one; consequently, by finding its solution,
we have in fact already solved a differential equation, albeit an exceedingly simple one.

Let us now proceed to the study of the basic concepts of integral calculus. Since we
discussed differential calculus with $x$ (rather than $t$) as the independent variable, for the sake
of symmetry we shall use $x$ here, too. For convenience, however, we shall in the present
discussion denote the primitive and derived functions by $F(x)$ and $f(x)$, respectively, rather
than distinguish them by the use of a prime.

14.2 Indefinite Integrals

The Nature of Integrals 

It has been mentioned that integration is the reverse of differentiation. If differentiation of
a given primitive function $F(x)$ yields the derivative $f(x)$, we can "integrate" $f(x)$ to find
$F(x)$, provided appropriate information is available to definitize the arbitrary constant
that will arise in the process of integration. The function $F(x)$ is referred to as an integral
(or antiderivative) of the function $f(x)$. These two types of process may thus be likened to
two ways of studying a family tree: integration involves the tracing of the parentage of the
function $f(x)$, whereas differentiation seeks out the progeny of the function $F(x)$. But note
this difference: while the (differentiable) primitive function $F(x)$ invariably produces a
lone offspring, namely, a unique derivative $f(x)$, the derived function $f(x)$ is traceable to
an infinite number of possible parents through integration, because if $F(x)$ is an integral of
$f(x)$, then so also must be $F(x)$ plus any constant, as we saw in (14.2).

We need a special notation to denote the required integration of $f(x)$ with respect to $x$.
The standard one is

$$\int f(x)\,dx$$

The symbol on the left, an elongated $S$ (with the connotation of sum, to be explained
later), is called the integral sign, whereas the $f(x)$ part is known as the integrand (the
function to be integrated), and the $dx$ part, similar to the $dx$ in the differentiation operator
$d/dx$, reminds us that the operation is to be performed with respect to the variable $x$.
However, you may also take $f(x)\,dx$ as a single entity and interpret it as the differential
of the primitive function $F(x)$ [that is, $dF(x) = f(x)\,dx$]. Then, the integral sign in front
can be viewed as an instruction to reverse the differentiation process that gave rise to the
differential. With this new notation, we can write that

$$\frac{d}{dx}F(x) = f(x) \quad\Rightarrow\quad \int f(x)\,dx = F(x) + c \tag{14.3}$$

where the presence of $c$, an arbitrary constant of integration, serves to indicate the multiple
parentage of the integrand.
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Example 1 


Example 2 


Example 3 


Example 4 


Example 5 


The integral ff(x) dx is, more specifically, known as the indefinite integral af fix) (as 
against the definite integral to be discussed in Sec. 14.2), because it has no definite numer¬ 
ical value. Because it is equal to F(x) + c , its value will in general vary with the value of 
.r (even if c is definitized). Thus, like a derivative, an indefinite integral is itself a function 
of Ihe variable x. 


Basic Rules of Integration 

Just as there are rules of derivation, we can also develop certain rules of integration. As may
be expected, the latter are heavily dependent on the rules of derivation with which we are
already familiar. From the following derivative formula for a power function,

$$\frac{d}{dx}\left(\frac{x^{n+1}}{n+1}\right) = x^n \qquad (n \neq -1)$$

for instance, we see that the expression $x^{n+1}/(n+1)$ is the primitive function for the
derivative function $x^n$; thus, by substituting these for F(x) and f(x) in (14.3), we may
state the result as a rule of integration.


Rule I (the power rule)

$$\int x^n\,dx = \frac{1}{n+1}x^{n+1} + c \qquad (n \neq -1)$$


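Example 1 Find $\int x^3\,dx$. Here, we have n = 3, and therefore

$$\int x^3\,dx = \frac{1}{4}x^4 + c$$


Example 2 Find $\int x\,dx$. Since n = 1, we have

$$\int x\,dx = \frac{1}{2}x^2 + c$$


Example 3 What is $\int 1\,dx$? To find this integral, we recall that $x^0 = 1$; so we can let n = 0 in the power
rule and get

$$\int 1\,dx = x + c$$

[$\int 1\,dx$ is sometimes written simply as $\int dx$, since 1 dx = dx.]


Example 4 Find $\int \sqrt{x^3}\,dx$. Since $\sqrt{x^3} = x^{3/2}$, we have n = 3/2; therefore,

$$\int \sqrt{x^3}\,dx = \frac{x^{5/2}}{5/2} + c = \frac{2}{5}\sqrt{x^5} + c$$


Example 5 Find $\int \frac{1}{x^4}\,dx$ $(x \neq 0)$. Since $1/x^4 = x^{-4}$, we have n = -4. Thus the integral is

$$\int \frac{1}{x^4}\,dx = \frac{x^{-4+1}}{-4+1} + c = -\frac{1}{3x^3} + c$$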


Note that the correctness of the results of integration can always be checked by differentiation;
if the integration process is correct, the derivative of the integral must be equal to
the integrand.

The derivative formulas for simple exponential and logarithmic functions have been
shown to be

$$\frac{d}{dx}e^x = e^x \qquad\text{and}\qquad \frac{d}{dx}\ln x = \frac{1}{x} \qquad (x > 0)$$

From these, two other basic rules of integration emerge.

Rule II (the exponential rule)

$$\int e^x\,dx = e^x + c$$

Rule III (the logarithmic rule)

$$\int \frac{1}{x}\,dx = \ln x + c \qquad (x > 0)$$

It is of interest that the integrand involved in Rule III is $1/x = x^{-1}$, which is a special
form of the power function $x^n$ with n = -1. This particular integrand is inadmissible under
the power rule, but now is duly taken care of by the logarithmic rule.

As stated, the logarithmic rule is placed under the restriction x > 0, because logarithms
do not exist for nonpositive values of x. A more general formulation of the rule, which can
take care of negative values of x, is

$$\int \frac{1}{x}\,dx = \ln|x| + c \qquad (x \neq 0)$$

which also implies that (d/dx) ln |x| = 1/x, just as (d/dx) ln x = 1/x. You should convince
yourself that the replacement of x (with the restriction x > 0) by |x| (with the
restriction $x \neq 0$) does not vitiate the formula in any way.

Also, as a matter of notation, it should be pointed out that the integral $\int \frac{1}{x}\,dx$ is
sometimes also written as $\int \frac{dx}{x}$.

As variants of Rules II and III, we also have the following two rules.

Rule IIa

$$\int f'(x)e^{f(x)}\,dx = e^{f(x)} + c$$

Rule IIIa

$$\int \frac{f'(x)}{f(x)}\,dx = \ln f(x) + c \qquad [f(x) > 0]$$

or

$$\int \frac{f'(x)}{f(x)}\,dx = \ln|f(x)| + c \qquad [f(x) \neq 0]$$

The bases for these two rules can be found in the derivative rules in (10.20).


Rules of Operation 

The three preceding rules amply illustrate the spirit underlying all rules of integration. Each
rule always corresponds to a certain derivative formula. Also, an arbitrary constant is
always appended at the end (even though it is to be definitized later by using a given boundary
condition) to indicate that a whole family of primitive functions can give rise to the
given integrand.

To be able to deal with more complicated integrands, however, we shall also find the
following two rules of operation with regard to integrals helpful.

Rule IV (the integral of a sum) The integral of the sum of a finite number of functions
is the sum of the integrals of those functions. For the two-function case, this means that

$$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx$$

This rule is a natural consequence of the fact that

$$\underbrace{\frac{d}{dx}[F(x) + G(x)]}_{A} = \underbrace{\frac{d}{dx}F(x) + \frac{d}{dx}G(x)}_{B} = \underbrace{f(x) + g(x)}_{C}$$

Inasmuch as A = C, on the basis of (14.3) we can write

$$\int [f(x) + g(x)]\,dx = F(x) + G(x) + c \tag{14.4}$$

But, from the fact that B = C, it follows that

$$\int f(x)\,dx = F(x) + c_1 \qquad\text{and}\qquad \int g(x)\,dx = G(x) + c_2$$

Thus we can obtain (by addition)

$$\int f(x)\,dx + \int g(x)\,dx = F(x) + G(x) + c_1 + c_2 \tag{14.5}$$

Since the constants $c$, $c_1$, and $c_2$ are arbitrary in value, we can let $c = c_1 + c_2$. Then the
right sides of (14.4) and (14.5) become equal, and as a consequence, their left sides must
be equal also. This proves Rule IV.

Example 6 Find $\int (x^3 + x + 1)\,dx$. By Rule IV, this integral can be expressed as a sum of three integrals:
$\int x^3\,dx + \int x\,dx + \int 1\,dx$. Since the values of these three integrals have previously been
found in Examples 1, 2, and 3, we can simply combine those results to get

$$\int (x^3 + x + 1)\,dx = \left(\frac{x^4}{4} + c_1\right) + \left(\frac{x^2}{2} + c_2\right) + (x + c_3) = \frac{x^4}{4} + \frac{x^2}{2} + x + c$$

In the final answer, we have lumped together the three subscripted constants into a single
constant c.


As a general practice, all the additive arbitrary constants of integration that emerge during
the process can always be combined into a single arbitrary constant in the final answer.

Example 7 Find $\int \left(2e^{2x} + \frac{14x}{7x^2 + 5}\right) dx$. By Rule IV, we can integrate the two additive terms in the
integrand separately, and then sum the results. Since the $2e^{2x}$ term is in the format of
$f'(x)e^{f(x)}$ in Rule IIa, with f(x) = 2x, the integral is $e^{2x} + c_1$. Similarly, the other term,
$14x/(7x^2 + 5)$, takes the form of $f'(x)/f(x)$, with $f(x) = 7x^2 + 5 > 0$. Thus, by Rule IIIa, the
integral is $\ln(7x^2 + 5) + c_2$. Hence we can write

$$\int \left(2e^{2x} + \frac{14x}{7x^2 + 5}\right) dx = e^{2x} + \ln(7x^2 + 5) + c$$

where we have combined $c_1$ and $c_2$ into one arbitrary constant c.


Rule V (the integral of a multiple) The integral of k times an integrand (k being a constant)
is k times the integral of that integrand. In symbols,

$$\int k f(x)\,dx = k \int f(x)\,dx$$

What this rule amounts to, operationally, is that a multiplicative constant can be "factored
out" of the integral sign. (Warning: A variable term cannot be factored out in this fashion!)
To prove this rule (for the case where k is an integer), we recall that k times f(x) merely
means adding f(x) k times; therefore, by Rule IV,

$$\int k f(x)\,dx = \int \underbrace{[f(x) + f(x) + \cdots + f(x)]}_{k\ \text{terms}}\,dx = \underbrace{\int f(x)\,dx + \int f(x)\,dx + \cdots + \int f(x)\,dx}_{k\ \text{terms}} = k \int f(x)\,dx$$


Example 8 Find $\int -f(x)\,dx$. Here k = -1, and thus

$$\int -f(x)\,dx = -\int f(x)\,dx$$

That is, the integral of the negative of a function is the negative of the integral of that
function.


Example 9 Find $\int 2x^2\,dx$. Factoring out the 2 and applying Rule I, we have

$$\int 2x^2\,dx = 2\int x^2\,dx = 2\left(\frac{x^3}{3} + c_1\right) = \frac{2}{3}x^3 + c$$


Example 10 Find $\int 3x^2\,dx$. In this case, factoring out the multiplicative constant yields

$$\int 3x^2\,dx = 3\int x^2\,dx = 3\left(\frac{x^3}{3} + c_1\right) = x^3 + c$$


Note that, in contrast to the preceding example, the term $x^3$ in the final answer does not
have any fractional expression attached to it. This neat result is due to the fact that 3 (the
multiplicative constant of the integrand) happens to be precisely equal to 2 (the power of
the function) plus 1. Referring to the power rule (Rule I), we see that the multiplicative constant
(n + 1) will in such a case cancel out the fraction 1/(n + 1), thereby yielding $(x^{n+1} + c)$
as the answer.


In general, whenever we have an expression $(n + 1)x^n$ as the integrand, there is really
no need to factor out the constant (n + 1) and then integrate $x^n$; instead, we may write
$x^{n+1} + c$ as the answer right away.

Example 11 Find $\int \left(5e^x - x^{-2} + \frac{3}{x}\right) dx$ $(x \neq 0)$. This example illustrates both Rules IV and V; actually,
it illustrates the first three rules as well:

$$\begin{aligned}
\int \left(5e^x - x^{-2} + \frac{3}{x}\right) dx &= 5\int e^x\,dx - \int x^{-2}\,dx + 3\int \frac{1}{x}\,dx \qquad \text{[by Rules IV and V]} \\
&= (5e^x + c_1) - \left(\frac{x^{-1}}{-1} + c_2\right) + (3\ln|x| + c_3) \\
&= 5e^x + \frac{1}{x} + 3\ln|x| + c
\end{aligned}$$


The correctness of the result can again be verified by differentiation. 
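
The check by differentiation can also be sketched numerically. The short Python fragment below is only an illustration, not part of the text's method: it assumes the integrand of Example 11 reads 5e^x - x^(-2) + 3/x, and the function names and the step size h are arbitrary choices made here. It compares a central-difference estimate of the derivative of the proposed integral with the integrand at a few sample points on both sides of x = 0.

```python
import math


def central_difference(F, x, h=1e-5):
    """Approximate F'(x) by the symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)


def integrand(x):
    """Assumed integrand of Example 11: 5e^x - x^(-2) + 3/x (x != 0)."""
    return 5 * math.exp(x) - x ** -2 + 3 / x


def antiderivative(x, c=0.0):
    """Proposed result of Example 11: 5e^x + 1/x + 3 ln|x| + c."""
    return 5 * math.exp(x) + 1 / x + 3 * math.log(abs(x)) + c


def max_relative_gap(points):
    """Largest scaled gap between the numerical derivative and the integrand."""
    return max(
        abs(central_difference(antiderivative, x) - integrand(x))
        / (1 + abs(integrand(x)))
        for x in points
    )


if __name__ == "__main__":
    # A gap close to zero is consistent with a correct integration result.
    print(max_relative_gap([-2.0, -0.5, 0.5, 1.0, 3.0]))
```

Because the arbitrary constant c drops out on differentiation, any value of c gives the same check; a small gap supports the result but is of course not a proof.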


Rules Involving Substitution 

Now we shall introduce two more rules of integration which seek to simplify the process
of integration, when the circumstances are appropriate, by a substitution of the original
variable of integration. Whenever the newly introduced variable of integration makes the
integration process easier than under the old, these rules will become of service.


Rule VI (the substitution rule) The integral of f(u)(du/dx) with respect to the variable
x is the integral of f(u) with respect to the variable u:

$$\int f(u)\frac{du}{dx}\,dx = \int f(u)\,du = F(u) + c$$

where the operation $\int du$ has been substituted for the operation $\int dx$.

This rule, the integral-calculus counterpart of the chain rule, may be proved by means of
the chain rule itself. Given a function F(u), where u = u(x), the chain rule states that

$$\frac{d}{dx}F(u) = \frac{d}{du}F(u)\frac{du}{dx} = F'(u)\frac{du}{dx} = f(u)\frac{du}{dx}$$

Since f(u)(du/dx) is the derivative of F(u), it follows from (14.3) that the integral (antiderivative)
of the former must be

$$\int f(u)\frac{du}{dx}\,dx = F(u) + c$$

You may note that this result, in fact, follows also from the canceling of the two dx expressions
on the left.


Example 12 Find $\int 2x(x^2 + 1)\,dx$. The answer to this can be obtained by first multiplying out the
integrand:

$$\int 2x(x^2 + 1)\,dx = \int (2x^3 + 2x)\,dx = \frac{x^4}{2} + x^2 + c$$

but let us now do it by the substitution rule. Let $u = x^2 + 1$; then du/dx = 2x, or
dx = du/2x. Substitution of du/2x for dx will yield

$$\int 2x(x^2 + 1)\,dx = \int 2xu\,\frac{du}{2x} = \int u\,du = \frac{u^2}{2} + c_1 = \frac{1}{2}(x^4 + 2x^2 + 1) + c_1 = \frac{1}{2}x^4 + x^2 + c$$

where $c = \frac{1}{2} + c_1$. The same answer can also be obtained by substituting du/dx for 2x
(instead of du/2x for dx).

Example 13 Find $\int 6x^2(x^3 + 2)^{99}\,dx$. The integrand of this example is not easily multiplied out, and thus
the substitution rule now has a better opportunity to display its effectiveness. Let
$u = x^3 + 2$; then $du/dx = 3x^2$, so that

$$\int 6x^2(x^3 + 2)^{99}\,dx = \int \left(2\frac{du}{dx}\right)u^{99}\,dx = \int 2u^{99}\,du = \frac{2}{100}u^{100} + c = \frac{1}{50}(x^3 + 2)^{100} + c$$


Example 14 Find $\int 8e^{2x+3}\,dx$. Let u = 2x + 3; then du/dx = 2, or dx = du/2. Hence,

$$\int 8e^{2x+3}\,dx = \int 8e^u\,\frac{du}{2} = 4\int e^u\,du = 4e^u + c = 4e^{2x+3} + c$$

As these examples show, this rule is of help whenever we can, by the judicious choice
of a function u = u(x), express the integrand (a function of x) as the product of f(u)
(a function of u) and du/dx (the derivative of the u function which we have chosen). However,
as illustrated by the last two examples, this rule can be used also when the original
integrand is transformable into a constant multiple of f(u)(du/dx). This would not affect
the applicability because the constant multiplier can be factored out of the integral sign,
which would then leave an integrand of the form f(u)(du/dx), as required in the substitution
rule. When the substitution of variables results in a variable multiple of f(u)(du/dx),
say, x times the latter, however, factoring is not permissible, and this rule will be of no help.
In fact, there exists no general formula giving the integral of a product of two functions in
terms of the separate integrals of those functions; nor do we have a general formula giving
the integral of a quotient of two functions in terms of their separate integrals. Herein lies
the reason why integration, on the whole, is more difficult than differentiation and why,
with complicated integrands, it is more convenient to look up the answer in prepared tables
of integration formulas rather than to undertake the integration by oneself.

Rule VII (integration by parts) The integral of v with respect to u is equal to uv less
the integral of u with respect to v:

$$\int v\,du = uv - \int u\,dv$$

The essence of this rule is to replace the operation $\int du$ by the operation $\int dv$.

The rationale behind this result is relatively simple. First, the product rule of differentials
gives us

$$d(uv) = v\,du + u\,dv$$

If we integrate both sides of the equation (i.e., integrate each differential), we get a new
equation

$$\int d(uv) = \int v\,du + \int u\,dv$$

or

$$uv = \int v\,du + \int u\,dv \qquad \text{[no constant is needed on the left (why?)]}$$


Then, by subtracting $\int u\,dv$ from both sides, the previously stated result emerges.

Example 15 Find $\int x(x + 1)^{1/2}\,dx$. Unlike Examples 12 and 13, the present example is not amenable to
the type of substitution used in Rule VI. (Why?) However, we may consider the given integral
to be in the form of $\int v\,du$, and apply Rule VII. To this end, we shall let v = x, implying
dv = dx, and also let $u = \frac{2}{3}(x + 1)^{3/2}$, so that $du = (x + 1)^{1/2}\,dx$. Then we can find the
integral to be

$$\begin{aligned}
\int x(x + 1)^{1/2}\,dx = \int v\,du &= uv - \int u\,dv \\
&= \frac{2}{3}(x + 1)^{3/2}x - \frac{2}{3}\int (x + 1)^{3/2}\,dx \\
&= \frac{2}{3}(x + 1)^{3/2}x - \frac{4}{15}(x + 1)^{5/2} + c
\end{aligned}$$


Example 16 Find $\int \ln x\,dx$ (x > 0). We cannot apply the logarithmic rule here, because that rule deals
with the integrand 1/x, not ln x. Nor can we use Rule VI. But if we let v = ln x, implying
dv = (1/x) dx, and also let u = x, so that du = dx, then the integration can be performed
as follows:

$$\int \ln x\,dx = \int v\,du = uv - \int u\,dv = x\ln x - \int dx = x\ln x - x + c = x(\ln x - 1) + c$$


Example 17 Find $\int xe^x\,dx$. In this case, we shall simply let v = x and $u = e^x$, so that dv = dx and
$du = e^x\,dx$. Applying Rule VII, we then have

$$\int xe^x\,dx = \int v\,du = uv - \int u\,dv = e^x x - \int e^x\,dx = e^x x - e^x + c = e^x(x - 1) + c$$

The validity of this result, like those of the preceding examples, can of course be readily 
checked by differentiation. 
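
As an illustration of such a check, here is a sketch of the differentiation for the result of Example 15, taking that result to be $\frac{2}{3}(x+1)^{3/2}x - \frac{4}{15}(x+1)^{5/2} + c$. The product rule is applied to the first term and the power rule (with the chain rule) to the second:

$$\begin{aligned}
\frac{d}{dx}\left[\frac{2}{3}x(x+1)^{3/2} - \frac{4}{15}(x+1)^{5/2} + c\right]
&= \frac{2}{3}(x+1)^{3/2} + \frac{2}{3}x \cdot \frac{3}{2}(x+1)^{1/2} - \frac{4}{15} \cdot \frac{5}{2}(x+1)^{3/2} \\
&= \frac{2}{3}(x+1)^{3/2} + x(x+1)^{1/2} - \frac{2}{3}(x+1)^{3/2} \\
&= x(x+1)^{1/2}
\end{aligned}$$

The derivative reproduces the integrand, as it should. The same procedure applied to $e^x(x - 1) + c$ gives $e^x(x - 1) + e^x = xe^x$, which is the integrand of Example 17.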


EXERCISE 14.2 




1. Find the following:

(a) $\int 16x^{-3}\,dx$ $(x \neq 0)$
(b) $\int 9x^8\,dx$
(c) $\int (x^5 - 3x)\,dx$
(d) $\int 2e^{-2x}\,dx$
(e) $\int \frac{4x}{x^2 + 1}\,dx$
(f) $\int (2ax + b)(ax^2 + bx)^7\,dx$

2. Find:

(a) $\int 13e^x\,dx$
(b) $\int \left(3e^x + \frac{4}{x}\right) dx$ (x > 0)
(c) $\int \left(5e^x + \frac{3}{x^2}\right) dx$ $(x \neq 0)$
(d) $\int 3e^{-(2x+7)}\,dx$
(e) $\int 4xe^{x^2+3}\,dx$
(f) $\int xe^{x^2+9}\,dx$

3. Find:

(a) $\int \frac{3\,dx}{x}$ $(x \neq 0)$
(b) $\int \frac{dx}{x - 2}$ $(x \neq 2)$
(c) $\int \frac{2x}{x^2 + 3}\,dx$
(d) $\int \frac{x}{3x^2 + 5}\,dx$

4. Find:

(a) $\int (x + 3)(x + 1)^{1/2}\,dx$
(b) $\int x\ln x\,dx$ (x > 0)

5. Given n constants $k_i$ (with i = 1, 2, ..., n) and n functions $f_i(x)$, deduce from Rules IV
and V that

$$\int \sum_{i=1}^{n} k_i f_i(x)\,dx = \sum_{i=1}^{n} k_i \int f_i(x)\,dx$$


14.3 Definite Integrals

Meaning of Definite Integrals

All the integrals cited in Sec. 14.2 are of the indefinite variety: each is a function of a variable
and, hence, possesses no definite numerical value. Now, for a given indefinite integral
of a continuous function f(x),

$$\int f(x)\,dx = F(x) + c$$

if we choose two values of x in the domain, say, a and b (a < b), substitute them successively
into the right side of the equation, and form the difference

$$[F(b) + c] - [F(a) + c] = F(b) - F(a)$$

we get a specific numerical value, free of the variable x as well as the arbitrary constant c.
This value is called the definite integral of f(x) from a to b. We refer to a as the lower limit
of integration and to b as the upper limit of integration.

In order to indicate the limits of integration, we now modify the integral sign to the form
$\int_a^b$. The evaluation of the definite integral is then symbolized in the following steps:

$$\int_a^b f(x)\,dx = F(x)\Big]_a^b = F(b) - F(a) \tag{14.6}$$

where the symbol $]_a^b$ (also written $|_a^b$ or $[\cdots]_a^b$) is an instruction to substitute b and a, successively,
for x in the result of integration to get F(b) and F(a), and then take their
difference, as indicated on the right of (14.6). As the first step, however, we must find the
indefinite integral, although we may omit the constant c, since the latter will drop out in the
process of difference-taking anyway.


Example 1 Evaluate $\int_1^5 3x^2\,dx$. Since the indefinite integral is $x^3 + c$, this definite integral has the value

$$\int_1^5 3x^2\,dx = x^3\Big]_1^5 = (5)^3 - (1)^3 = 125 - 1 = 124$$


Example 2 Evaluate $\int_a^b ke^x\,dx$. Here, the limits of integration are given in symbols; consequently, the
result of integration is also in terms of those symbols:

$$\int_a^b ke^x\,dx = ke^x\Big]_a^b = k(e^b - e^a)$$


Example 3 Evaluate $\int_0^4 \left(\frac{1}{1 + x} + 2x\right) dx$ $(x \neq -1)$. The indefinite integral is $\ln|1 + x| + x^2 + c$; thus
the answer is

$$\begin{aligned}
\int_0^4 \left(\frac{1}{1 + x} + 2x\right) dx &= \left[\ln|1 + x| + x^2\right]_0^4 \\
&= (\ln 5 + 16) - (\ln 1 + 0) \\
&= \ln 5 + 16 \qquad \text{[since } \ln 1 = 0\text{]}
\end{aligned}$$


It is important to realize that the limits of integration a and b both refer to values of the
variable x. Were we to use the substitution-of-variables technique (Rules VI and VII) during
integration and introduce a variable u, care should be taken not to consider a and b as
the limits of u. Example 4 will illustrate this point.


Example 4 Evaluate $\int_1^2 (2x^3 - 1)^2(6x^2)\,dx$. Let $u = 2x^3 - 1$; then $du/dx = 6x^2$, or $du = 6x^2\,dx$. Now
notice that, when x = 1, u will be 1 but that, when x = 2, u will be 15; in other words,
the limits of integration in terms of the variable u should be 1 (lower) and 15 (upper).
Rewriting the given integral in u will therefore give us not $\int_1^2 u^2\,du$ but

$$\int_1^{15} u^2\,du = \frac{1}{3}u^3\Big]_1^{15} = \frac{1}{3}(15^3 - 1^3) = 1{,}124\tfrac{2}{3}$$

Alternatively, we may first convert u back to x and then use the original limits of 1 and 2 to
get the identical answer:

$$\left[\frac{1}{3}u^3\right]_{u=1}^{u=15} = \left[\frac{1}{3}(2x^3 - 1)^3\right]_{x=1}^{x=2} = \frac{1}{3}(15^3 - 1^3) = 1{,}124\tfrac{2}{3}$$
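
The point about converting the limits can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the integral is taken to be that of (2x^3 - 1)^2 (6x^2) from x = 1 to x = 2, with the substitution u = 2x^3 - 1, and all function names are hypothetical helpers introduced here. Exact fractions are used for the antiderivative route, and a composite Simpson's rule on the original integrand serves as an independent numerical cross-check.

```python
from fractions import Fraction


def u_of_x(x):
    """The substitution u = 2x^3 - 1."""
    return 2 * Fraction(x) ** 3 - 1


def primitive_in_u(u):
    """A primitive function of u^2, namely u^3 / 3."""
    return Fraction(u) ** 3 / 3


def integrate_with_u_limits(x_lower, x_upper):
    """Correct route: convert the x limits into u limits before evaluating."""
    return primitive_in_u(u_of_x(x_upper)) - primitive_in_u(u_of_x(x_lower))


def integrate_with_unconverted_limits(x_lower, x_upper):
    """Mistaken route: treat the x limits as if they were limits of u."""
    return primitive_in_u(x_upper) - primitive_in_u(x_lower)


def integrand(x):
    """Original integrand in x."""
    return (2 * x ** 3 - 1) ** 2 * (6 * x ** 2)


def simpson(f, a, b, n=200):
    """Composite Simpson's rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


if __name__ == "__main__":
    print(integrate_with_u_limits(1, 2))            # exact value with limits 1 and 15 in u
    print(integrate_with_unconverted_limits(1, 2))  # what results if the limits are not converted
    print(simpson(integrand, 1.0, 2.0))             # numerical cross-check in x
```

The exact route returns 3374/3, that is, 1,124 2/3, and the numerical estimate agrees closely with it, whereas leaving the limits at 1 and 2 gives a very different number.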


A Definite Integral as an Area under a Curve 

Every definite integral has a definite value. That value may be interpreted geometrically to
be a particular area under a given curve.

The graph of a continuous function y = f(x) is drawn in Fig. 14.1. If we seek to measure
the (shaded) area A enclosed by the curve and the x axis between the two points a and b
in the domain, we may proceed in the following manner. First, we divide the interval [a, b]
into n subintervals (not necessarily equal in length). Four of these are drawn in Fig. 14.1a
(that is, n = 4), the first being $[x_1, x_2]$ and the last, $[x_4, x_5]$. Since each of these represents
a change in x, we may refer to them as $\Delta x_1, \ldots, \Delta x_4$, respectively. Now, on the subintervals
let us construct four rectangular blocks such that the height of each block is equal
to the highest value of the function attained in that block (which happens to occur at
the left-side boundary of each rectangle here). The first block thus has the height $f(x_1)$ and
the width $\Delta x_1$, and, in general, the ith block has the height $f(x_i)$ and the width $\Delta x_i$. The
total area A* of this set of blocks is the sum

$$A^* = \sum_{i=1}^{n} f(x_i)\,\Delta x_i \qquad (n = 4 \text{ in Fig. 14.1a})$$

FIGURE 14.1

This, though, is obviously not the area under the curve we seek, but only a very rough
approximation thereof.

What makes A* deviate from the true value of A is the unshaded portion of the rectangular
blocks; these make A* an overestimate of A. If the unshaded portion can be shrunk
in size and be made to approach zero, however, the approximation value A* will correspondingly
approach the true value A. This result will materialize when we try a finer and
finer segmentation of the interval [a, b], so that n is increased and $\Delta x_i$ is shortened indefinitely.
Then the blocks will become more slender (if more numerous), and the protrusion
beyond the curve will diminish, as can be seen in Fig. 14.1b. Carried to the limit, this
"slenderizing" operation yields

$$\lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x_i = \lim_{n \to \infty} A^* = \text{area } A \tag{14.7}$$

provided this limit exists. (It does in the present case.) This equation, indeed, constitutes the
formal definition of an area under a curve.

The summation expression in (14.7), $\sum_{i=1}^{n} f(x_i)\,\Delta x_i$, bears a certain resemblance to the
definite integral expression $\int_a^b f(x)\,dx$. Indeed, the latter is based on the former. The
replacement of $\Delta x_i$ by the differential dx is done in the same spirit as in our earlier discussion
of "approximation" in Sec. 8.1. Thus, we rewrite $f(x_i)\,\Delta x_i$ into f(x) dx. What about
the summation sign? The $\sum_{i=1}^{n}$ notation represents the sum of a finite number of terms. When
we let $n \to \infty$, and take the limit of that sum, the regular notation for such an operation is
rather cumbersome. Thus a simpler substitute is needed. That substitute is $\int_a^b$, where the
elongated S symbol also indicates a sum, and where a and b (just as i = 1 and n) serve to
specify the lower and upper limits of this sum. In short, the definite integral is a shorthand
for the limit-of-a-sum expression in (14.7). That is,

$$\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x_i = \text{area } A$$

Thus the said definite integral (referred to as a Riemann integral) now has an area connotation
as well as a sum connotation, because $\int_a^b$ is the continuous counterpart of the
discrete concept of $\sum_{i=1}^{n}$.

In Fig. 14.1, we attempted to approximate area A by systematically reducing an overestimate
A* by finer segmentation of the interval [a, b]. The resulting limit of the sum of
block areas is called the upper integral, an approximation from above. We could also have
approximated area A from below by forming rectangular blocks inscribed by the curve
rather than protruding beyond it (see Exercise 14.3-3). The total area A** of this new set of
blocks will underestimate A, but as the segmentation of [a, b] becomes finer and finer, we
shall again find $\lim_{n \to \infty} A^{**} = A$. The last-cited limit of the sum of block areas is called the
lower integral. If, and only if, the upper integral and lower integral are equal in value, then
the Riemann integral $\int_a^b f(x)\,dx$ is defined, and the function f(x) is said to be Riemann
integrable. There exist theorems specifying the conditions under which a function f(x) is
integrable. According to the fundamental theorem of calculus, a function is integrable in
[a, b] if it is continuous in that interval. As long as we are working with continuous functions,
therefore, we should have no worries in this regard.
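
The squeezing of the area between an overestimate and an underestimate can be sketched numerically. The Python fragment below is an illustration only: it assumes equal-width subintervals and a monotonic integrand (so that the highest and lowest values on each subinterval occur at its endpoints), and it uses f(x) = 3x^2 on [1, 5], whose definite integral is 124, as the test case. The function name riemann_sums is a hypothetical helper introduced here.

```python
def riemann_sums(f, a, b, n):
    """Return (lower_sum, upper_sum) for a monotonic f on [a, b].

    The interval is cut into n equal subintervals. For a monotonic function
    the lowest and highest values on each subinterval occur at its endpoints,
    so the block heights can be read off there.
    """
    dx = (b - a) / n
    lower = 0.0
    upper = 0.0
    for i in range(n):
        left = f(a + i * dx)
        right = f(a + (i + 1) * dx)
        lower += min(left, right) * dx  # inscribed block (underestimate)
        upper += max(left, right) * dx  # protruding block (overestimate)
    return lower, upper


def f(x):
    """Test integrand: 3x^2, with definite integral 124 over [1, 5]."""
    return 3 * x ** 2


if __name__ == "__main__":
    for n in (4, 40, 400, 4000):
        lower, upper = riemann_sums(f, 1.0, 5.0, n)
        # The gap upper - lower shrinks as the segmentation becomes finer.
        print(n, lower, upper, upper - lower)
```

With n = 4 the two sums are 90 and 162, a wide bracket around 124; as n grows, both sums close in on the same value, which is the sense in which the upper and lower integrals coincide.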

Another point may be noted. Although the area A in Fig. 14.1 happens to lie entirely
under a decreasing portion of the curve y = f(x), the conceptual equating of a definite integral
with an area is valid also for upward-sloping portions of the curve. In fact, both types
of slope may be present simultaneously; e.g., we can calculate $\int_0^b f(x)\,dx$ as the area
under the curve in Fig. 14.1 above the line 0b.

FIGURE 14.2

Note that, if we calculate the area B in Fig. 14.2 by the definite integral $\int_a^b f(x)\,dx$, the
answer will come out negative, because the height of each rectangular block involved in
this area is negative. This gives rise to the notion of a negative area, an area that lies below
the x axis and above a given curve. In case we are interested in the numerical rather than the
algebraic value of such an area, therefore, we should take the absolute value of the relevant
definite integral. The area $C = \int_c^d f(x)\,dx$, on the other hand, has a positive sign even
though it lies in the negative region of the x axis; this is because each rectangular block has
a positive height as well as a positive width when we are moving from c to d. From this, the
implication is clear that interchange of the two limits of integration would, by reversing the
direction of movement, alter the sign of $\Delta x_i$ and of the definite integral. Applied to area B,
we see that the definite integral $\int_b^a f(x)\,dx$ (from b to a) will give the negative of the area
B; this will measure the numerical value of this area.


Some Properties of Definite Integrals 

The discussion in the preceding paragraph leads us to the following property of definite 
integrals. 


Property I The interchange of the limits of integration changes the sign of the definite
integral:

$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$$

This can be proved as follows:

$$\int_b^a f(x)\,dx = F(a) - F(b) = -[F(b) - F(a)] = -\int_a^b f(x)\,dx$$

Definite integrals also possess some other interesting properties.

Property II A definite integral has a value of zero when the two limits of integration are
identical:

$$\int_a^a f(x)\,dx = F(a) - F(a) = 0$$

Under the "area" interpretation, this means that the area (under a curve) above any single
point in the domain is nil. This is as it should be, because on top of a point on the x axis,
we can draw only a (one-dimensional) line, never a (two-dimensional) area.

FIGURE 14.3

Property III A definite integral can be expressed as a sum of a finite number of definite
subintegrals as follows:

$$\int_a^d f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx + \int_c^d f(x)\,dx \qquad (a < b < c < d)$$

Only three subintegrals are shown in this equation, but the extension to the case of n
subintegrals is also valid. This property is sometimes described as the additivity property.

In terms of area, this means that the area (under the curve) lying above the interval [a, d]
on the x axis can be obtained by summing the areas lying above the subintervals in the set
{[a, b], [b, c], [c, d]}. Note that, since we are dealing with closed intervals, the border
points b and c have each been included in two areas. Is this not double counting? It indeed
is. But fortunately no damage is done, because by Property II the area above a single point
is zero, so that the double counting produces no effect on the calculation. But, needless to
say, the double counting of any interval is never permitted.

Earlier, it was mentioned that all continuous functions are Riemann integrable. Now, by
Property III, we can also find the definite integrals (areas) of certain discontinuous functions.
Consider the step function in Fig. 14.3a. In spite of the discontinuity at point b in the
interval [a, c], we can find the shaded area from the sum

$$\int_a^b f(x)\,dx + \int_b^c f(x)\,dx$$

The same also applies to the curve in Fig. 14.3b.

Property IV

$$\int_a^b -f(x)\,dx = -\int_a^b f(x)\,dx$$

Property V

$$\int_a^b kf(x)\,dx = k\int_a^b f(x)\,dx$$

Property VI

$$\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$$

Property VII (integration by parts) Given u(x) and v(x),

$$\int_{x=a}^{x=b} v\,du = uv\Big]_{x=a}^{x=b} - \int_{x=a}^{x=b} u\,dv$$

These last four properties, all borrowed from the rules of indefinite integration, should
require no further explanation.
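
For readers who like to see the properties at work on a concrete case, the following Python sketch checks Properties I, II, III, and V exactly. It is an illustration under stated assumptions: the integrand is taken to be 3x^2 with primitive function x^3, and the helper names are hypothetical ones introduced here. Each definite integral is evaluated as F(upper) - F(lower), in the manner of (14.6).

```python
from fractions import Fraction


def definite_integral(F, lower, upper):
    """Evaluate a definite integral as F(upper) - F(lower)."""
    return F(Fraction(upper)) - F(Fraction(lower))


def primitive(x):
    """A primitive function of f(x) = 3x^2."""
    return x ** 3


def scaled_primitive(k):
    """A primitive function of k * 3x^2, for a constant k."""
    return lambda x: k * primitive(x)


if __name__ == "__main__":
    a, b = 1, 5
    # Property I: interchanging the limits changes the sign.
    print(definite_integral(primitive, b, a), -definite_integral(primitive, a, b))
    # Property II: identical limits give zero.
    print(definite_integral(primitive, a, a))
    # Property III: additivity over [1, 2], [2, 4], [4, 5].
    parts = (
        definite_integral(primitive, 1, 2)
        + definite_integral(primitive, 2, 4)
        + definite_integral(primitive, 4, 5)
    )
    print(parts, definite_integral(primitive, a, b))
    # Property V: a constant multiple can be factored out.
    print(definite_integral(scaled_primitive(7), a, b), 7 * definite_integral(primitive, a, b))
```

Such a check only confirms the properties for one integrand and one set of limits; the general arguments are those given in the text.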

Another Look at the Indefinite Integral 

We introduced the definite integral by way of attaching two limits of integration to an 
indefinite integral. Now that we know the meaning of the definite integral, let us see how 
we can revert from the latter to the indefinite integral. 

Suppose that, instead of fixing the upper limit of integration at b, we allow it to be a
variable, designated simply as x. Then the integral will take the form

$$\int_a^x f(x)\,dx = F(x) - F(a)$$

which, now being a function of x, denotes a variable area under the curve of f(x). But
since the last term on the right is a constant, this integral must be a member of the family
of primitive functions of f(x), which we denoted earlier as F(x) + c. If we set c = -F(a),
then the above integral becomes exactly the indefinite integral $\int f(x)\,dx$.

From this point of view, therefore, we may consider the $\int$ symbol to mean the same as
$\int_a^x$, provided it is understood that in the latter version of the symbol the lower limit of
integration is related to the constant of integration by the equation c = -F(a).



EXERCISE 14.3 

1, 


2. 


Evaluate the following; 

(b) f x(x 2 + 6) dx 

Jo 

(c) j 3s/x dx 
Evaluate the following: 



- 2x dx 
dx 

x + 2 


p 4 

(d) j (x 3 -6x 2 ) dx 
(?) j {ax 2 + bx + c) dx 


(c) j (e 2> + e*) dx 


dx

3. In Fig. 14.1a, take the lowest value of the function attained in each subinterval as the
height of the rectangular block, i.e., take $f(x_2)$ instead of $f(x_1)$ as the height of the first
block, though still retaining $\Delta x_1$ as its width, and do likewise for the other blocks.

(a) Write a summation expression for the total area A** of the new rectangles.

(b) Does A** overestimate or underestimate the desired area A?

(c) Would A** tend to approach or to deviate further from A if a finer segmentation of
[a, b] were introduced? (Hint: Try a diagram.)

(d) In the limit, when the number n of subintervals approaches $\infty$, would the approximation
value A** approach the true value A just as the approximation value A* did?

(e) What can you conclude from (a) to (d) about the Riemann integrability of the
function f(x) in Fig. 14.1a?


4. The definite integral $\int_a^b f(x)\,dx$ is said to represent an area under a curve. Does this
curve refer to the graph of the integrand f(x), or of the primitive function F(x)? If we
plot the graph of the F(x) function, how can we show the given definite integral on
it: by an area, a line segment, or a point?

5. Verify that a constant c can be equivalently expressed as a definite integral:

(a) $c = \int_0^b \frac{c}{b}\,dx$ (b) $c = \int_0^c 1\,dt$


14.4 Improper Integrals


Certain integrals are said to be "improper." We shall briefly discuss two varieties thereof.

Infinite Limits of Integration

When we have definite integrals of the form

$$\int_a^{\infty} f(x)\,dx \qquad\text{and}\qquad \int_{-\infty}^{b} f(x)\,dx$$

with one limit of integration being infinite, we refer to them as improper integrals. In these
cases, it is not possible to evaluate the integrals as, respectively,

$$F(\infty) - F(a) \qquad\text{and}\qquad F(b) - F(-\infty)$$

because $\infty$ is not a number, and therefore it cannot be substituted for x in the function
F(x). Instead, we must resort once more to the concept of limits.

The first improper integral we cited can be defined to be the limit of another (proper)
integral as the latter's upper limit of integration tends to $\infty$; that is,

$$\int_a^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx \tag{14.8}$$

If this limit exists, the improper integral is said to be convergent (or to converge), and the
limiting process will yield the value of the integral. If the limit does not exist, the improper
integral is said to be divergent and is in fact meaningless. By the same token, we can define

$$\int_{-\infty}^{b} f(x)\,dx = \lim_{a \to -\infty} \int_a^b f(x)\,dx \tag{14.8'}$$


with the same criterion of convergence and divergence.

Example 1 Evaluate $\int_1^{\infty} \frac{dx}{x^2}$. First we note that

$$\int_1^b \frac{dx}{x^2} = \frac{-1}{x}\Big]_1^b = \frac{-1}{b} + 1$$

Hence, in line with (14.8), the desired integral is

$$\int_1^{\infty} \frac{dx}{x^2} = \lim_{b \to \infty} \int_1^b \frac{dx}{x^2} = \lim_{b \to \infty} \left(\frac{-1}{b} + 1\right) = 1$$

This improper integral does converge, and it has a value of 1.

Since the limit expression is cumbersome to write, some people prefer to omit the "lim"
notation and write simply

$$\int_1^{\infty} \frac{dx}{x^2} = \frac{-1}{x}\Big]_1^{\infty} = 0 + 1 = 1$$

Even when written in this form, however, the improper integral should nevertheless be
interpreted with the limit concept in mind.

Graphically, this improper integral still has the connotation of an area. But since the
upper limit of integration is allowed to take on increasingly larger values in this case,
the right-side boundary must be extended eastward indefinitely, as shown in Fig. 14.4a.
Despite this, we are able to consider the area to have the definite (limit) value of 1.


Example 2 Evaluate $\int_1^{\infty} \frac{dx}{x}$. As before, we first find

$$\int_1^b \frac{dx}{x} = \ln x\Big]_1^b = \ln b - \ln 1 = \ln b$$

When we let $b \to \infty$, by (10.16') we have $\ln b \to \infty$. Thus the given improper integral is
divergent.

Figure 14.4b shows the graph of the function 1/x, as well as the area corresponding to
the given integral. The indefinite eastward extension of the right-side boundary will result
this time in an infinite area, even though the shape of the graph displays a superficial
similarity to that of Fig. 14.4a.
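
The contrast between the two examples can be tabulated directly from the proper integrals found above, 1 - 1/b for the integrand 1/x^2 and ln b for the integrand 1/x. The Python sketch below is an illustration only; the function names and the sample values of b are arbitrary choices made here.

```python
import math


def partial_integral_inverse_square(b):
    """Proper integral of 1/x^2 from 1 to b, i.e. 1 - 1/b (Example 1)."""
    return 1.0 - 1.0 / b


def partial_integral_reciprocal(b):
    """Proper integral of 1/x from 1 to b, i.e. ln b (Example 2)."""
    return math.log(b)


if __name__ == "__main__":
    for b in (10.0, 1e3, 1e6, 1e9):
        # The first column settles toward 1; the second keeps growing.
        print(b, partial_integral_inverse_square(b), partial_integral_reciprocal(b))
```

The first column levels off near 1 while the second grows without bound, though only slowly, which is why the two graphs in Fig. 14.4 can look alike and yet enclose a finite area in one case and an infinite area in the other.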


FIGURE 14.4

What if both limits of integration are infinite? A direct extension of (14.8) and (14.8')
would suggest the definition

$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{\substack{b \to +\infty \\ a \to -\infty}} \int_a^b f(x)\,dx \tag{14.8''}$$

Again, this improper integral is said to converge if and only if the limit in question exists.

Infinite Integrand 

Even with finite limits of integration, an integral can still be improper if the integrand becomes
infinite somewhere in the interval of integration [a, b]. To evaluate such an integral,
we must again rely upon the concept of a limit.

Example 3 Evaluate $\int_0^1 \frac{1}{x}\,dx$. This integral is improper because, as Fig. 14.4b shows, the integrand is
infinite at the lower limit of integration ($1/x \to \infty$ as $x \to 0^+$). Therefore we should first
find the integral

$$\int_a^1 \frac{1}{x}\,dx = \ln x\Big]_a^1 = \ln 1 - \ln a = -\ln a \qquad \text{[for } a > 0\text{]}$$

and then evaluate its limit as $a \to 0^+$:

$$\int_0^1 \frac{1}{x}\,dx = \lim_{a \to 0^+} \int_a^1 \frac{1}{x}\,dx = \lim_{a \to 0^+} (-\ln a)$$

Since this limit does not exist (as $a \to 0^+$, $\ln a \to -\infty$), the given integral is divergent.

f 9 

Evaluate / *" 1/2 dx, When x -> 0 + , the integrand ^/</x becomes infinite; the integral is 

Jo 

improper. Again, we can first find 


■' 1/2 dx = 2x ]/2 


-i9 


= 6 - 2-fa 


J a 


The limit of this expression as a -* 0 + is 6 - 0 = 6. Thus the given integral is convergent 
(to 6). 
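
The same kind of numerical check applies to an integrand that becomes infinite at the lower limit. The sketch below (Python; the function name is illustrative) evaluates the two antiderivative expressions from Examples 3 and 4 for ever smaller positive a.

```python
import math

def lower_limit_integrals(a):
    """Proper integrals with lower limit a > 0, from the antiderivatives above."""
    example_3 = -math.log(a)               # integral of 1/x over [a, 1]
    example_4 = 6.0 - 2.0 * math.sqrt(a)   # integral of x**(-1/2) over [a, 9]
    return example_3, example_4

# The values of a are arbitrary; they only need to shrink toward zero.
for a in (1e-2, 1e-6, 1e-12):
    example_3, example_4 = lower_limit_integrals(a)
    print(f"a = {a:.0e}: Example 3 -> {example_3:.3f}, Example 4 -> {example_4:.6f}")
```

The first column grows without bound as a shrinks, while the second approaches 6, in line with the limits found in the two examples.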


The situation where the integrand becomes infinite at the upper limit of integration is
perfectly similar. It is an altogether different proposition, however, when an infinite value
of the integrand occurs in the open interval (a, b) rather than at a or b. In this eventuality,
it is necessary to take advantage of the additivity of definite integrals and first decompose
the given integral into subintegrals. Assume that $f(x) \to \infty$ as $x \to p$, where p is a point
in the interval (a, b); then, by the additivity property, we have

$$\int_a^b f(x)\,dx = \int_a^p f(x)\,dx + \int_p^b f(x)\,dx$$

The given integral on the left can be considered as convergent if and only if each
subintegral has a limit.

Example 5

Evaluate $\int_{-1}^{1} \frac{1}{x^3}\,dx$. The integrand tends to infinity when x approaches zero; thus we must
write the given integral as the sum

$$\int_{-1}^{1} x^{-3}\,dx = \int_{-1}^{0} x^{-3}\,dx + \int_{0}^{1} x^{-3}\,dx \qquad (\text{say, } \equiv I_1 + I_2)$$

The integral $I_1$ is divergent, because

$$\lim_{b\to 0^-}\int_{-1}^{b} x^{-3}\,dx = \lim_{b\to 0^-}\left[\frac{-1}{2}x^{-2}\right]_{-1}^{b} = \lim_{b\to 0^-}\left(-\frac{1}{2b^2} + \frac{1}{2}\right) = -\infty$$

Thus, we can conclude immediately, without having to evaluate $I_2$, that the given integral
is divergent.


EXERCISE 14.4 


1. Check the definite integrals given in Exercises 14.3-1 and 14.3-2 to determine whether
any of them is improper. If improper, indicate which variety of improper integral each
one is.

2. Which of the following integrals are improper, and why? 

( 0 ) / e~ rt dt (d) / e rt dt 

Jo J -» 

// * <*>f h 

p] 

(c) / x~ m dx 
Jo 

3. Evaluate all the improper integrals in Prob. 2. 

4. Evaluate the integral I 2 of Example 5, and show that it is also divergent. 

5. (a) Graph the function $y = ce^{-t}$ for nonnegative t, (c > 0), and shade the area under
the curve.

(b) Write a mathematical expression for this area, and determine whether it is a finite 
area. 


(f) J 6dx 


14.5 Some Economic Applications of Integrals

Integrals are used in economic analysis in various ways. We shall illustrate a few simple
applications in the present section and then show the application to the Domar growth
model in Sec. 14.6.

From a Marginal Function to a Total Function 

Given a total function (e.g., a total-cost function), the process of differentiation can yield 
the marginal function (e.g., the marginal-cost function). Because the process of integration 
is the opposite of differentiation, it should enable us, conversely, to infer the total function 
from a given marginal function. 


Example 1

If the marginal cost (MC) of a firm is the following function of output, $C'(Q) = 2e^{0.2Q}$, and
if the fixed cost is $C_F = 90$, find the total-cost function C(Q). By integrating C'(Q) with
respect to Q, we find that

$$\int 2e^{0.2Q}\,dQ = 2\,\frac{1}{0.2}\,e^{0.2Q} + c = 10e^{0.2Q} + c \qquad (14.9)$$

This result may be taken as the desired C(Q) function except that, in view of the arbitrary
constant c, the answer appears indeterminate. Fortunately, the information that $C_F = 90$
can be used as an initial condition to definitize the constant. When Q = 0, total cost C
will consist solely of $C_F$. Setting Q = 0 in the result of (14.9), therefore, we should get a
value of 90; that is, $10e^0 + c = 90$. But this would imply that c = 90 - 10 = 80. Hence, the
total-cost function is

$$C(Q) = 10e^{0.2Q} + 80$$

Note that, unlike the case of (14.2), where the arbitrary constant c has the same value as
the initial value of the variable H(0), in the present example we have c = 80 but
$C(0) = C_F = 90$, so that the two take different values. In general, it should not be assumed
that the arbitrary constant c will always be equal to the initial value of the total function.


Example 2 


If the marginal propensity to save (MPS) is the following function of income,
$S'(Y) = 0.3 - 0.1Y^{-1/2}$, and if the aggregate savings S is nil when income Y is 81, find the saving
function S(Y). As the MPS is the derivative of the S function, the problem now calls for the
integration of S'(Y):

$$S(Y) = \int \left(0.3 - 0.1Y^{-1/2}\right) dY = 0.3Y - 0.2Y^{1/2} + c$$

The specific value of the constant c can be found from the fact that S = 0 when Y = 81.
Even though, strictly speaking, this is not an initial condition (not relating to Y = 0),
substitution of this information into the preceding integral will nevertheless serve to definitize c.
Since

$$0 = 0.3(81) - 0.2(9) + c \quad\Rightarrow\quad c = -22.5$$

the desired saving function is

$$S(Y) = 0.3Y - 0.2Y^{1/2} - 22.5$$

The technique illustrated in Examples 1 and 2 can be extended directly to other problems
involving the search for total functions (such as total revenue, total consumption)
from given marginal functions. It may also be reiterated that in problems of this type the
validity of the answer (an integral) can always be checked by differentiation.
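
As one way of carrying out such a check without pencil and paper, the following Python sketch compares a numerical derivative of each total function found in Examples 1 and 2 with the given marginal function. The function names and the step size h are illustrative assumptions, not part of the text.

```python
import math

def total_cost(q):
    return 10.0 * math.exp(0.2 * q) + 80.0        # C(Q) found in Example 1

def marginal_cost(q):
    return 2.0 * math.exp(0.2 * q)                # C'(Q) as given

def saving(y):
    return 0.3 * y - 0.2 * math.sqrt(y) - 22.5    # S(Y) found in Example 2

def marginal_propensity_to_save(y):
    return 0.3 - 0.1 / math.sqrt(y)               # S'(Y) as given

def central_difference(f, x, h=1e-5):
    """Numerical derivative of f at x; the step h is an arbitrary small number."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Evaluation points are arbitrary; the two columns should agree closely.
print(central_difference(total_cost, 5.0), marginal_cost(5.0))
print(central_difference(saving, 81.0), marginal_propensity_to_save(81.0))
# The side conditions: C(0) = 90 and S(81) = 0 (up to rounding error).
print(total_cost(0.0), saving(81.0))
```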

Investment and Capital Formation 

Capital formation is the process of adding to a given stock of capital. Regarding this
process as continuous over time, we may express capital stock as a function of time, K(t),
and use the derivative dK/dt to denote the rate of capital formation.1 But the rate of capital
formation at time t is identical with the rate of net investment flow at time t, denoted by
I(t). Thus, capital stock K and net investment I are related by the following two equations:

$$\frac{dK}{dt} \equiv I(t)$$

and

$$K(t) = \int I(t)\,dt = \int \frac{dK}{dt}\,dt = \int dK$$

The first of the preceding equations is an identity; it shows the synonymity between net
investment and the increment of capital. Since I(t) is the derivative of K(t), it stands to
reason that K(t) is the integral or antiderivative of I(t), as shown in the second equation.
The transformation of the integrand in the latter equation is also easy to comprehend: The
switch from I to dK/dt is by definition, and the next transformation is by cancellation of
two identical differentials, i.e., by the substitution rule.

1 As a matter of notation, the derivative of a variable with respect to time often is also denoted by a
dot placed over the variable, such as $\dot{K} \equiv dK/dt$. In dynamic analysis, where derivatives with respect
to time occur in abundance, this more concise symbol can contribute substantially to notational
simplicity. However, a dot, being such a tiny mark, is easily lost sight of or misplaced; thus, great care
is required in using this symbol.

Sometimes the concept of gross investment is used together with that of net investment
in a model. Denoting gross investment by $I_g$ and net investment by I, we can relate them to
each other by the equation

$$I_g = I + \delta K$$

where $\delta$ represents the rate of depreciation of capital and $\delta K$, the rate of replacement
investment.


Example 3 


Suppose that the net investment flow is described by the equation $I(t) = 3t^{1/2}$ and that the
initial capital stock, at time t = 0, is K(0). What is the time path of capital K? By integrating
I(t) with respect to t, we obtain

$$K(t) = \int I(t)\,dt = \int 3t^{1/2}\,dt = 2t^{3/2} + c$$

Next, letting t = 0 in the leftmost and rightmost expressions, we find K(0) = c. Therefore,
the time path of K is

$$K(t) = 2t^{3/2} + K(0) \qquad (14.10)$$

Observe the basic similarity between the results in (14.10) and in (14.2'').


The concept of definite integral enters into the picture when one desires to find the
amount of capital formation during some interval of time (rather than the time path of K).
Since $\int I(t)\,dt = K(t)$, we may write the definite integral

$$\int_a^b I(t)\,dt = \Big[K(t)\Big]_a^b = K(b) - K(a)$$

to indicate the total capital accumulation during the time interval [a, b]. Of course, this also
represents an area under the I(t) curve. It should be noted, however, that in the graph of the
K(t) function, this definite integral would appear instead as a vertical distance; more
specifically, as the difference between the two vertical distances K(b) and K(a). (Cf.
Exercise 14.3-4.)

To appreciate this distinction between K(t) and I(t) more fully, let us emphasize that
capital K is a stock concept, whereas investment I is a flow concept. Accordingly, while
K(t) tells us the amount of K existing at each point of time, I(t) gives us the information
about the rate of (net) investment per year (or per period of time) which is prevailing at
each point of time. Thus, in order to calculate the amount of net investment undertaken
(capital accumulation), we must first specify the length of the interval involved. This fact
can also be seen when we rewrite the identity $dK/dt \equiv I(t)$ as $dK \equiv I(t)\,dt$, which states
that dK, the increment in K, is based not only on I(t), the rate of flow, but also on dt, the
time that elapsed. It is this need to specify the time interval in the expression I(t) dt that
brings the definite integral into the picture, and gives rise to the area representation under
the I(t) curve, as against the K(t) curve.

Example 4

If net investment is a constant flow at I(t) = 1,000 (dollars per year), what will be the total
net investment (capital formation) during a year, from t = 0 to t = 1? Obviously, the answer
is $1,000; this can be obtained formally as follows:

$$\int_0^1 I(t)\,dt = \int_0^1 1{,}000\,dt = \Big[1{,}000\,t\Big]_0^1 = 1{,}000$$

You can verify that the same answer will emerge if, instead, the year involved is from t = 1
to t = 2.


Example 5 


If $I(t) = 3t^{1/2}$ (thousands of dollars per year), a nonconstant flow, what will be the capital
formation during the time interval [1, 4], that is, during the second, third, and fourth
years? The answer lies in the definite integral

$$\int_1^4 3t^{1/2}\,dt = \Big[2t^{3/2}\Big]_1^4 = 16 - 2 = 14$$

On the basis of the preceding examples, we may express the amount of capital accumulation
during the time interval [0, t], for any investment rate I(t), by the definite integral

$$\int_0^t I(t)\,dt = \Big[K(t)\Big]_0^t = K(t) - K(0)$$

Figure 14.5 illustrates the case of the time interval $[0, t_0]$. Viewed differently, the preceding
equation yields the following expression for the time path K(t):

$$K(t) = K(0) + \int_0^t I(t)\,dt$$

The amount of K at any time t is the initial capital plus the total capital accumulation that
has occurred since.
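
The stock-flow relation can be illustrated numerically. The sketch below (Python) approximates the definite integral of the investment flow $I(t) = 3t^{1/2}$ of Examples 3 and 5 by the midpoint rule and then builds K(t) from it. The midpoint rule, the number of steps, and the initial capital K(0) = 100 are assumptions made here for illustration only.

```python
import math

def net_investment(t):
    return 3.0 * math.sqrt(t)    # I(t) = 3 t^(1/2), as in Examples 3 and 5

def capital_accumulation(a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of I(t) over [a, b]."""
    width = (b - a) / steps
    return sum(net_investment(a + (k + 0.5) * width) for k in range(steps)) * width

def capital_stock(t, initial_capital):
    """K(t) = K(0) plus the accumulation over [0, t]."""
    return initial_capital + capital_accumulation(0.0, t)

# Accumulation over [1, 4]; Example 5 finds 14 analytically.
print(capital_accumulation(1.0, 4.0))
# K(4) with a hypothetical K(0) = 100; (14.10) gives 2 * 4**1.5 + 100 = 116.
print(capital_stock(4.0, 100.0))
```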

FIGURE 14.5






Present Value of a Cash Flow 

Our earlier discussion of discounting and present value, limited to the case of a single
future value V, led us to the discounting formulas

$$A = V(1 + i)^{-t} \qquad [\text{discrete case}]$$

and

$$A = Ve^{-rt} \qquad [\text{continuous case}]$$

Now suppose that we have a stream or flow of future values: a series of revenues receivable
at various times or of cost outlays payable at various times. How do we compute the
present value of the entire cash stream, or cash flow?

In the discrete case, if we assume three future revenue figures $R_t$ (t = 1, 2, 3) available
at the end of the t-th year and also assume an interest rate of i per annum, the present values
of $R_t$ will be, respectively,

$$R_1(1 + i)^{-1} \qquad R_2(1 + i)^{-2} \qquad R_3(1 + i)^{-3}$$

It follows that the total present value is the sum

$$\Pi = \sum_{t=1}^{3} R_t(1 + i)^{-t} \qquad (14.11)$$

($\Pi$ is the uppercase Greek letter pi, here signifying present.) This differs from the single-
value formula only in the replacement of V by $R_t$ and in the insertion of the $\Sigma$ sign.
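
The discrete sum in (14.11) translates directly into a few lines of Python. In the sketch below the function name and the revenue figures are hypothetical, chosen only so that the result is easy to verify by hand; the list may hold any number of year-end revenues, not just three.

```python
def present_value_discrete(revenues, interest_rate):
    """Sum of R_t (1 + i)^(-t), where revenues[0] arrives at the end of year 1."""
    return sum(
        revenue * (1.0 + interest_rate) ** (-year)
        for year, revenue in enumerate(revenues, start=1)
    )

# Hypothetical stream: 110 at the end of year 1 and 121 at the end of year 2,
# discounted at 10 percent, so each receipt is worth 100 today.
print(present_value_discrete([110.0, 121.0], 0.10))
```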

The idea of the sum readily carries over to the case of a continuous cash flow, but in the
latter context the $\Sigma$ symbol must give way, of course, to the definite integral sign. Consider
a continuous revenue stream at the rate of R(t) dollars per year. This means that at $t = t_1$
the rate of flow is $R(t_1)$ dollars per year, but at another point of time $t = t_2$ the rate will
be $R(t_2)$ dollars per year, with t taken as a continuous variable. At any point of time,
the amount of revenue during the interval [t, t + dt] can be written as R(t) dt [cf. the
previous discussion of $dK \equiv I(t)\,dt$]. When continuously discounted at the rate of r per
year, its present value should be $R(t)e^{-rt}\,dt$. If we let our problem be that of finding the
total present value of a 3-year stream, our answer is to be found in the following definite
integral:

$$\Pi = \int_0^3 R(t)e^{-rt}\,dt \qquad (14.11')$$

This expression, the continuous version of the sum in (14.11), differs from the single-value
formula only in the replacement of V by R(t) and in the appending of the definite integral
sign.†

† It may be noted that, whereas the upper summation index and the upper limit of integration are
identical at 3, the lower summation index 1 differs from the lower limit of integration 0. This is
because the first revenue in the discrete stream, by assumption, will not be forthcoming until t = 1
(end of first year), but the revenue flow in the continuous case is assumed to commence immediately
after t = 0.


Example 6

What is the present value of a continuous revenue flow lasting for y years at the constant
rate of D dollars per year and discounted at the rate of r per year? According to (14.11'), we
have

$$\Pi = \int_0^y De^{-rt}\,dt = D\int_0^y e^{-rt}\,dt = D\left[\frac{-1}{r}e^{-rt}\right]_{t=0}^{t=y} = \frac{-D}{r}\left(e^{-ry} - 1\right) = \frac{D}{r}\left(1 - e^{-ry}\right) \qquad (14.12)$$

Thus, $\Pi$ depends on D, r, and y. If D = $3,000, r = 0.06, and y = 2, for instance, we have

$$\Pi = \frac{3{,}000}{0.06}\left(1 - e^{-0.12}\right) = 50{,}000(1 - 0.8869) = \$5{,}655 \qquad [\text{approximately}]$$

The value of $\Pi$ naturally is always positive; this follows from the positivity of D and r, as well
as $(1 - e^{-ry})$. (The number e raised to any negative power will always give a positive
fractional value, as can be seen from the second quadrant of Fig. 10.3a.)
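
Formula (14.12) is easy to evaluate by machine. The following Python sketch (function and parameter names are illustrative) reproduces the numbers of Example 6. Without rounding $e^{-0.12}$ to four decimals, the result comes out slightly below the text's approximate figure, at roughly $5,654.

```python
import math

def present_value_constant_flow(flow_rate, discount_rate, years):
    """(D / r) * (1 - e^(-r y)), the closed form in (14.12)."""
    return flow_rate / discount_rate * (1.0 - math.exp(-discount_rate * years))

# D = 3,000 dollars per year, r = 0.06, y = 2 years, as in Example 6.
print(round(present_value_constant_flow(3000.0, 0.06, 2.0), 2))
```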


Example 7 


In the wine-storage problem of Sec. 10.6, we assumed zero storage cost. That simplifying 
assumption was necessitated by our ignorance of a way to compute the present value of a 
cost flow. With this ignorance behind us, we are now ready to permit the wine dealer to 
incur storage costs. 

Let the purchase cost of the case of wine be an amount C, incurred at the present time.
Its (future) sale value, which varies with time, may be generally denoted as V(t), its present
value being $V(t)e^{-rt}$. Whereas the sale value represents a single future value (there can be
only one sale transaction on this case of wine), the storage cost is a stream. Assuming this
cost to be a constant stream at the rate of s dollars per year, the total present value of the
storage cost incurred in a total of t years will amount to

$$\int_0^t se^{-rt}\,dt = \frac{s}{r}\left(1 - e^{-rt}\right) \qquad [\text{cf. } (14.12)]$$

Thus the net present value, which is what the dealer would seek to maximize, can be expressed as

$$N(t) = V(t)e^{-rt} - \frac{s}{r}\left(1 - e^{-rt}\right) - C = \left[V(t) + \frac{s}{r}\right]e^{-rt} - \frac{s}{r} - C$$

which is an objective function in a single choice variable t.

To maximize N(t), the value of t must be chosen such that N'(t) = 0. This first derivative is

$$N'(t) = V'(t)e^{-rt} - r\left[V(t) + \frac{s}{r}\right]e^{-rt} \qquad [\text{product rule}]$$

$$\phantom{N'(t)} = \left[V'(t) - rV(t) - s\right]e^{-rt}$$

and it will be zero if and only if

$$V'(t) = rV(t) + s$$

Thus, this last equation may be taken as the necessary optimization condition for the choice
of the time of sale t*.
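
To see the condition at work, one may pick a concrete sale-value function and search for the best sale time numerically. The Python sketch below is only an illustration under stated assumptions: the form $V(t) = e^{\sqrt{t}}$, the parameter values r = 0.10, s = 0.5, C = 1, and the grid search itself are all choices made here, not given in the example. It locates the t that maximizes N(t) on a fine grid and then reports how close $V'(t) - rV(t) - s$ is to zero at that point.

```python
import math

R = 0.10   # discount rate r (assumed value)
S = 0.5    # storage cost per year s (assumed value)
C = 1.0    # purchase cost (assumed value)

def sale_value(t):
    return math.exp(math.sqrt(t))                         # assumed V(t)

def sale_value_slope(t):
    return math.exp(math.sqrt(t)) / (2.0 * math.sqrt(t))  # V'(t) for that V(t)

def net_present_value(t):
    """N(t) = [V(t) + s/r] e^(-rt) - s/r - C."""
    return (sale_value(t) + S / R) * math.exp(-R * t) - S / R - C

def best_sale_time(t_max=40.0, step=0.001):
    """Grid search for the t in (0, t_max] that maximizes N(t)."""
    steps = round(t_max / step)
    candidates = (k * step for k in range(1, steps + 1))
    return max(candidates, key=net_present_value)

def first_order_gap(t):
    """V'(t) - r V(t) - s, which should vanish at the optimal t."""
    return sale_value_slope(t) - R * sale_value(t) - S

t_star = best_sale_time()
print(t_star, first_order_gap(t_star))
```

With these assumed numbers the search stops a little earlier than t = 25, which is the point where $V'(t) = rV(t)$ would hold if storage were free; a positive storage cost makes it worthwhile to sell sooner.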

The economic interpretation of this condition appeals easily to intuitive reasoning: V'(t)
represents the rate of change of the sale value, or the increment in V, if sale is postponed for
a year, while the two terms on the right indicate, respectively, the increments in the interest
cost and the storage cost entailed by such a postponement of sale (revenue and cost are
both reckoned at time t*). So, the idea of the equating of the two sides is to us just some "old
wine in a new bottle," for it is nothing but the same MC = MR condition in a different guise!

Present Value of a Perpetual Flow

If a cash flow were to persist forever (a situation exemplified by the interest from a
perpetual bond or the revenue from an indestructible capital asset such as land), the present
value of the flow would be

$$\Pi = \int_0^\infty R(t)e^{-rt}\,dt$$

which is an improper integral.


Example 8 


Find the present value of a perpetual income stream flowing at the uniform rate of D dollars
per year, if the continuous rate of discount is r. Since, in evaluating an improper integral,
we simply take the limit of a proper integral, the result in (14.12) can still be of help.
Specifically, we can write

$$\Pi = \int_0^\infty De^{-rt}\,dt = \lim_{y\to\infty}\int_0^y De^{-rt}\,dt = \lim_{y\to\infty}\frac{D}{r}\left(1 - e^{-ry}\right) = \frac{D}{r}$$

Note that the y parameter (number of years) has disappeared from the final answer. This
is as it should be, for here we are dealing with a perpetual flow. You may also observe that
our result (present value = rate of revenue flow ÷ rate of discount) corresponds precisely to
the familiar formula for the so-called capitalization of an asset with a perpetual yield.
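
The limit can be watched taking shape numerically. In the Python sketch below (names and the sample figures D = 3,000, r = 0.06 are illustrative), the finite-horizon value from (14.12) is computed for longer and longer horizons and compared with D/r.

```python
import math

def present_value_constant_flow(flow_rate, discount_rate, years):
    """(D / r) * (1 - e^(-r y)), the finite-horizon value from (14.12)."""
    return flow_rate / discount_rate * (1.0 - math.exp(-discount_rate * years))

def perpetuity_value(flow_rate, discount_rate):
    """D / r, the limit of the finite-horizon value as y grows without bound."""
    return flow_rate / discount_rate

# The horizons are arbitrary; the gap to D / r shrinks as y grows.
for years in (10, 50, 100, 500):
    finite = present_value_constant_flow(3000.0, 0.06, years)
    print(years, round(finite, 2), round(perpetuity_value(3000.0, 0.06) - finite, 6))
```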


EXERCISE 14.5 


1. Given the following marginal-revenue functions:

(a) $R'(Q) = 28Q - e^{0.3Q}$ (b) $R'(Q) = 10(1 + Q)^{-2}$

find in each case the total-revenue function R(Q). What initial condition can you
introduce to definitize the constant of integration?

2. (a) Given the marginal propensity to import M'(Y) = 0.1 and the information that
M = 20 when Y = 0, find the import function M(Y).

(b) Given the marginal propensity to consume $C'(Y) = 0.8 + 0.1Y^{-1/2}$ and the
information that C = Y when Y = 100, find the consumption function C(Y).

3. Assume that the rate of investment is described by the function I (t) = 12t' and that 
K( 0) = 25: 

(a) Find the time path of capital stock K. 

(b) Find the amount of capital accumulation during the time intervals [0,1] and [1, 3], 
respectively. 

4. Given a continuous income stream at the constant rate of $1,000 per year:

(a) What will be the present value $\Pi$ if the income stream lasts for 2 years and the
continuous discount rate is 0.05 per year?

(b) What will be the present value $\Pi$ if the income stream terminates after exactly
3 years and the discount rate is 0.04?

5. What is the present value of a perpetual cash flow of: 

(a) $1,450 per year, discounted at r = 5%?

(b) $2,460 per year, discounted at r = 8%?

14.6 Domar Growth Model


In the population-growth problem of (14.1) and (14.2) and the capital-formation problem
of (14.10), the common objective is to delineate a time path on the basis of some given
pattern of change of a variable. In the classic growth model of Professor Domar,† on the other
hand, the idea is to stipulate the type of time path required to prevail if a certain equilibrium
condition of the economy is to be satisfied.


The Framework 

The basic premises of the Domar model are as follows: 


1. Any change in the rate of investment flow per year I(t) will produce a dual effect: it will
affect the aggregate demand as well as the productive capacity of the economy.

2. The demand effect of a change in I(t) operates through the multiplier process, assumed
to work instantaneously. Thus an increase in I(t) will raise the rate of income flow per
year Y(t) by a multiple of the increment in I(t). The multiplier is k = 1/s, where s
stands for the given (constant) marginal propensity to save. On the assumption that I(t)
is the only (parametric) expenditure flow that influences the rate of income flow, we can
then state that

$$\frac{dY}{dt} = \frac{dI}{dt}\,\frac{1}{s} \qquad (14.13)$$

3. The capacity effect of investment is to be measured by the change in the rate of potential
output the economy is capable of producing. Assuming a constant capacity-capital
ratio, we can write

$$\frac{\kappa}{K} \equiv \rho \qquad (= \text{a constant})$$

where $\kappa$ (the Greek letter kappa) stands for capacity or potential output flow per year,
and $\rho$ (the Greek letter rho) denotes the given capacity-capital ratio. This implies, of
course, that with a capital stock K(t) the economy is potentially capable of producing
an annual product, or income, amounting to $\kappa \equiv \rho K$ dollars. Note that, from $\kappa \equiv \rho K$
(the production function), it follows that $d\kappa = \rho\,dK$, and

$$\frac{d\kappa}{dt} = \rho\,\frac{dK}{dt} = \rho I \qquad (14.14)$$

In Domar's model, equilibrium is defined to be a situation in which productive capacity
is fully utilized. To have equilibrium is, therefore, to require the aggregate demand to be
exactly equal to the potential output producible in a year; that is, $Y = \kappa$. If we start initially
from an equilibrium situation, however, the requirement will reduce to the balancing of the
respective changes in capacity and in aggregate demand; that is,

$$\frac{dY}{dt} = \frac{d\kappa}{dt} \qquad (14.15)$$


What kind of time path of investment I(t) can satisfy this equilibrium condition at all
times?

† Evsey D. Domar, "Capital Expansion, Rate of Growth, and Employment," Econometrica, April 1946,
pp. 137-147; reprinted in Domar, Essays in the Theory of Economic Growth, Oxford University Press,
Fair Lawn, N.J., 1957, pp. 70-82.


Finding the Solution 

To answer this question, we first substitute (14.13) and (14.14) into the equilibrium
condition (14.15). The result is the following differential equation:

$$\frac{dI}{dt}\,\frac{1}{s} = \rho I \qquad\text{or}\qquad \frac{1}{I}\,\frac{dI}{dt} = \rho s \qquad (14.16)$$

Since (14.16) specifies a definite pattern of change for I, we should be able to find the
equilibrium (or required) investment path from it.

In this simple case, the solution is obtainable by directly integrating both sides of the
second equation in (14.16) with respect to t. The fact that the two sides are identical in
equilibrium assures the equality of their integrals. Thus,

$$\int \frac{1}{I}\,\frac{dI}{dt}\,dt = \int \rho s\,dt$$

By the substitution rule and the log rule, the left side gives us

$$\int \frac{dI}{I} = \ln|I| + c_1 \qquad (I \neq 0)$$

whereas the right side yields ($\rho s$ being a constant)

$$\int \rho s\,dt = \rho s t + c_2$$

Equating the two results and combining the two constants, we have

$$\ln|I| = \rho s t + c \qquad (14.17)$$

To obtain |I| from ln|I|, we perform an operation known as "taking the antilog of ln|I|,"
which utilizes the fact that $e^{\ln x} = x$. Thus, letting each side of (14.17) become the exponent
of the constant e, we obtain

$$e^{\ln|I|} = e^{(\rho s t + c)}$$

or

$$|I| = e^{\rho s t}e^{c} = Ae^{\rho s t} \qquad \text{where } A \equiv e^{c}$$

If we take investment to be positive, then |I| = I, so that the preceding result becomes
$I(t) = Ae^{\rho s t}$, where A is arbitrary. To get rid of this arbitrary constant, we set t = 0 in the
equation $I(t) = Ae^{\rho s t}$, to get $I(0) = Ae^0 = A$. This definitizes the constant A, and enables
us to express the solution, that is, the required investment path, as

$$I(t) = I(0)e^{\rho s t} \qquad (14.18)$$

where I(0) denotes the initial rate of investment.†

This result has a disquieting economic meaning. In order to maintain the balance
between capacity and demand over time, the rate of investment flow must grow precisely
at the exponential rate of $\rho s$, along a path such as illustrated in Fig. 14.6. Obviously, the
larger the capacity-capital ratio or the marginal propensity to save, the larger the required
rate of growth will be. But at any rate, once the values of $\rho$ and s are known, the required
growth path of investment becomes very rigidly set.

† The solution (14.18) will remain valid even if we let investment be negative in the result
$|I| = Ae^{\rho s t}$. See Exercise 14.6-3.

FIGURE 14.6
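
The required path (14.18) can be cross-checked against the differential equation (14.16) by stepping the equation forward numerically. The Python sketch below does so with Euler's method; the parameter values ($\rho$ = 0.3, s = 0.2, I(0) = 100), the horizon of 10 years, and the choice of Euler's method are assumptions made for illustration.

```python
import math

def required_investment(t, initial_investment, rho, s):
    """Closed-form path (14.18): I(t) = I(0) e^(rho s t)."""
    return initial_investment * math.exp(rho * s * t)

def euler_investment(t_end, initial_investment, rho, s, steps=100_000):
    """Step dI/dt = rho s I forward from t = 0 to t_end with Euler's method."""
    dt = t_end / steps
    investment = initial_investment
    for _ in range(steps):
        investment += rho * s * investment * dt
    return investment

# Hypothetical parameters: rho = 0.3, s = 0.2, I(0) = 100, horizon t = 10.
closed_form = required_investment(10.0, 100.0, 0.3, 0.2)
stepped = euler_investment(10.0, 100.0, 0.3, 0.2)
print(closed_form, stepped)
```

The two figures agree to several digits, and the small remaining gap is the discretization error of the stepping scheme, not a discrepancy in (14.18).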


The Razor's Edge 

It now becomes relevant to ask what will happen if the actual rate of growth of investment
(call that rate r) differs from the required rate $\rho s$.

Domar's approach is to define a coefficient of utilization

$$u = \lim_{t\to\infty}\frac{Y(t)}{\kappa(t)} \qquad [u = 1 \text{ means full utilization of capacity}]$$

and show that $u = r/\rho s$, so that $u \gtrless 1$ as $r \gtrless \rho s$. In other words, if there is a discrepancy
between the actual and required rates ($r \neq \rho s$), we will find in the end (as $t \to \infty$) either
a shortage of capacity (u > 1) or a surplus of capacity (u < 1), depending on whether r is
greater or less than $\rho s$.

We can show, however, that the conclusion about capacity shortage and surplus really
applies at any time t, not only as $t \to \infty$. For a given growth rate r implies that

$$I(t) = I(0)e^{rt} \qquad\text{and}\qquad \frac{dI}{dt} = rI(0)e^{rt}$$

Therefore, by (14.13) and (14.14), we have

$$\frac{dY}{dt} = \frac{1}{s}\,\frac{dI}{dt} = \frac{r}{s}\,I(0)e^{rt}$$

$$\frac{d\kappa}{dt} = \rho I(t) = \rho I(0)e^{rt}$$

The ratio between these two derivatives,

$$\frac{dY/dt}{d\kappa/dt} = \frac{r}{\rho s}$$

should tell us the relative magnitudes of the demand-creating effect and the
capacity-generating effect of investment at any time t, under the actual growth rate of r. If r (the
actual rate) exceeds $\rho s$ (the required rate), then $dY/dt > d\kappa/dt$, and the demand effect
will outstrip the capacity effect, causing a shortage of capacity. Conversely, if $r < \rho s$, there
will be a deficiency in aggregate demand and, hence, a surplus of capacity.

The curious thing about this conclusion is that if investment actually grows at a faster
rate than required ($r > \rho s$), the end result will be a shortage rather than a surplus of
capacity. It is equally curious that if the actual growth of investment lags behind the
required rate ($r < \rho s$), we will encounter a capacity surplus rather than a shortage. Indeed,
because of such paradoxical results, if we now allow the entrepreneurs to adjust the actual
growth rate r (hitherto taken to be a constant) according to the prevailing capacity situation,
they will most certainly make the "wrong" kind of adjustment. In the case of $r > \rho s$, for
instance, the emergent capacity shortage will motivate an even faster rate of investment.
But this would mean an increase in r, instead of the reduction called for under the
circumstances. Consequently, the discrepancy between the two rates of growth would be
intensified rather than reduced.

The upshot is that, given the parametric constants $\rho$ and s, the only way to avoid both
shortage and surplus of productive capacity is to guide the investment flow ever so carefully
along the equilibrium path with a growth rate $r^* = \rho s$. And, as we have shown,
any deviation from such a "razor's edge" time path will bring about a persistent failure to
satisfy the norm of full utilization which Domar envisaged in this model. This is perhaps
not too cheerful a prospect to contemplate. Fortunately, more flexible results become possible
when certain assumptions of the Domar model are modified, as we shall see from the
growth model of Professor Solow, to be discussed in Chap. 15.
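
The razor's-edge property can be made concrete with a few numbers. The Python sketch below (function names and parameter values are hypothetical) computes dY/dt and $d\kappa/dt$ for an investment path growing at an actual rate r and forms their ratio, which should equal $r/\rho s$ at every date t.

```python
import math

def demand_growth(t, r, s, initial_investment):
    """dY/dt = (1/s) dI/dt = (r/s) I(0) e^(rt), from (14.13)."""
    return (r / s) * initial_investment * math.exp(r * t)

def capacity_growth(t, r, rho, initial_investment):
    """d(kappa)/dt = rho I(t) = rho I(0) e^(rt), from (14.14)."""
    return rho * initial_investment * math.exp(r * t)

def demand_to_capacity_ratio(t, r, rho, s, initial_investment=100.0):
    return (demand_growth(t, r, s, initial_investment)
            / capacity_growth(t, r, rho, initial_investment))

# Hypothetical rho = 0.3 and s = 0.2, so the required rate is rho * s = 0.06.
for actual_rate in (0.04, 0.06, 0.08):
    ratios = [demand_to_capacity_ratio(t, actual_rate, 0.3, 0.2) for t in (0.0, 5.0, 25.0)]
    print(actual_rate, [round(x, 4) for x in ratios])
```

For each actual rate the ratio is the same at all three dates: below 1 (capacity surplus) when r is less than 0.06, equal to 1 on the required path, and above 1 (capacity shortage) when r exceeds 0.06.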


EXERCISE 14.6 

1. How many factors of production are explicitly considered in the Domar model? What 
does this fact imply with regard to the capital-labor ratio in production? 

2. We learned in Sec. 10.2 that the constant r in the exponential function $Ae^{rt}$ represents
the rate of growth of the function. Apply this to (14.16), and deduce (14.18) without
going through integration.

3. Show that even if we let investment be negative in the equation $|I| = Ae^{\rho s t}$, upon
definitizing the arbitrary constant A we will still end up with the solution (14.18).

4. Show that the result in (14.18) can be obtained alternatively by finding (and
equating) the definite integrals of both sides of (14.16),

$$\frac{1}{I}\,\frac{dI}{dt} = \rho s$$

with respect to the variable t, with limits of integration t = 0 and t = t. Remember that
when we change the variable of integration from t to I, the limits of integration will
change from t = 0 and t = t, respectively, to I = I(0) and I = I(t).



Chapter 15


Continuous Time:
First-Order Differential
Equations



In the Domar growth model, we have solved a simple differential equation by direct
integration. For more complicated differential equations, there are various established methods
of solution. Even in the latter cases, however, the fundamental idea underlying the methods
of solution is still the techniques of integral calculus. For this reason, the solution to a
differential equation is often referred to as the integral of that equation.

Only first-order differential equations will be discussed in the present chapter. In this
context, the word order refers to the highest order of the derivatives (or differentials)
appearing in the differential equation; thus a first-order differential equation can contain
only the first derivative, say, dy/dt.

15.1 First-Order Linear Differential Equations with Constant 
Coefficient and Constant Term 


The first derivative dy/dt is the only one that can appear in a first-order differential
equation, but it may enter in various powers: dy/dt, $(dy/dt)^2$, or $(dy/dt)^3$. The highest power
attained by the derivative in the equation is referred to as the degree of the differential
equation. In case the derivative dy/dt appears only in the first degree, and so does the
dependent variable y, and furthermore, no product of the form y(dy/dt) occurs, then the
equation is said to be linear. Thus a first-order linear differential equation will generally
take the form1

$$\frac{dy}{dt} + u(t)y = w(t) \qquad (15.1)$$

where u and w are two functions of t, as is y. In contrast to dy/dt and y, however, no
restriction whatsoever is placed on the independent variable t. Thus the functions u and w
may very well represent such expressions as $t^2$ and $e^t$ or some more complicated functions
of t; on the other hand, u and w may also be constants.

1 Note that the derivative term dy/dt in (15.1) has a unit coefficient. This is not to imply that it can
never actually have a coefficient other than one, but when such a coefficient appears, we can always
"normalize" the equation by dividing each term by the said coefficient. For this reason, the form
given in (15.1) may nonetheless be regarded as a general representation.

This last point leads us to a further classification. When the function u (the coefficient of 
the dependent variable y) is a constant, and when the function w is a constant additive term, 
(15.1) reduces to the special case of a first-order linear differential equation with constant 
coefficient and constant term. In this section, we shall deal only with this simple variety of 
differential equations. 


The Homogeneous Case 

If u and w are constant functions and if w happens to be identically zero, (15.1) will become

$$\frac{dy}{dt} + ay = 0 \qquad (15.2)$$

where a is some constant. This differential equation is said to be homogeneous on account
of the zero constant term (compare with homogeneous-equation systems). The defining
characteristic of a homogeneous equation is that when all the variables (here, dy/dt and y)
are multiplied by a given constant, the equation remains valid. This characteristic holds if
the constant term is zero, but will be lost if the constant term is not zero.

Equation (15.2) can be written alternatively as

$$\frac{1}{y}\,\frac{dy}{dt} = -a \qquad (15.2')$$

But you will recognize that the differential equation (14.16) we met in the Domar model is
precisely of this form. Therefore, by analogy, we should be able to write the solution of
(15.2) or (15.2') immediately as follows:

$$y(t) = Ae^{-at} \qquad [\text{general solution}] \qquad (15.3)$$

or

$$y(t) = y(0)e^{-at} \qquad [\text{definite solution}] \qquad (15.3')$$

In (15.3), there appears an arbitrary constant A; therefore it is a general solution. When any
particular value is substituted for A, the solution becomes a particular solution of (15.2).
There is an infinite number of particular solutions, one for each possible value of A,
including the value y(0). This latter value, however, has a special significance: y(0) is the
only value that can make the solution satisfy the initial condition. Since this represents the
result of definitizing the arbitrary constant, we shall refer to (15.3') as the definite solution
of the differential equation (15.2) or (15.2').

You should observe two things about the solution of a differential equation: (1) the
solution is not a numerical value, but rather a function y(t), a time path if t symbolizes time; and
(2) the solution y(t) is free of any derivative or differential expressions, so that as soon as a
specific value of t is substituted into it, a corresponding value of y can be calculated directly.


The Nonhomogeneous Case 

When a nonzero constant takes the place of the zero in (15.2), we have a nonhomogeneous
linear differential equation

$$\frac{dy}{dt} + ay = b \qquad (15.4)$$

The solution of this equation will consist of the sum of two terms, one of which is called
the complementary function (which we shall denote by $y_c$), and the other known as the
particular integral (to be denoted by $y_p$). As will be shown, each of these has a significant
economic interpretation. Here, we shall present only the method of solution; its rationale
will become clear later.

Even though our objective is to solve the nonhomogeneous equation (15.4), frequently
we shall have to refer to its homogeneous version, as shown in (15.2). For convenient
reference, we call the latter the reduced equation of (15.4). The nonhomogeneous equation
(15.4) itself can accordingly be referred to as the complete equation. It turns out that the
complementary function $y_c$ is nothing but the general solution of the reduced equation,
whereas the particular integral $y_p$ is simply any particular solution of the complete
equation.

Our discussion of the homogeneous case has already given us the general solution of the
reduced equation, and we may therefore write

$$y_c = Ae^{-at} \qquad [\text{by (15.3)}]$$

What about the particular integral? Since the particular integral is any particular solution
of the complete equation, we can first try the simplest possible type of solution, namely, y
being some constant (y = k). If y is a constant, then it follows that dy/dt = 0, and (15.4)
will become ay = b, with the solution y = b/a. Therefore, the constant solution will work
as long as $a \neq 0$. In that case, we have

$$y_p = \frac{b}{a} \qquad (a \neq 0)$$

The sum of the complementary function and the particular integral then constitutes the
general solution of the complete equation (15.4):

$$y(t) = y_c + y_p = Ae^{-at} + \frac{b}{a} \qquad [\text{general solution, case of } a \neq 0] \qquad (15.5)$$

What makes this a general solution is the presence of the arbitrary constant A. We may,
of course, definitize this constant by means of an initial condition. Let us say that y takes
the value y(0) when t = 0. Then, by setting t = 0 in (15.5), we find that

$$y(0) = A + \frac{b}{a} \qquad\text{and}\qquad A = y(0) - \frac{b}{a}$$

Thus we can rewrite (15.5) into

$$y(t) = \left[y(0) - \frac{b}{a}\right]e^{-at} + \frac{b}{a} \qquad [\text{definite solution, case of } a \neq 0] \qquad (15.5')$$

It should be noted that the use of the initial condition to definitize the arbitrary constant
is, and should be, undertaken as the final step, after we have found the general solution
to the complete equation. Since the values of both $y_c$ and $y_p$ are related to the value of y(0),
both of these must be taken into account in definitizing the constant A.

Example 1 Solve the equation dy/dt + 2y = 6, with the initial condition y(0) = 10. Here, we have
a = 2 and b = 6; thus, by (15.5'), the solution is

$$y(t) = (10 - 3)e^{-2t} + 3 = 7e^{-2t} + 3$$

Example 2 Solve the equation dy/dt + 4y = 0, with the initial condition y(0) = 1. Since a = 4 and
b = 0, we have

$$y(t) = (1 - 0)e^{-4t} + 0 = e^{-4t}$$

The same answer could have been obtained from (15.3'), the formula for the homogeneous
case. The homogeneous equation (15.2) is merely a special case of the nonhomogeneous
equation (15.4) when b = 0. Consequently, the formula (15.3') is also a special case of
formula (15.5') under the circumstance that b = 0.
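For readers who like to experiment numerically, formula (15.5') can be sketched in a few lines of Python. This is only an illustration, not part of the text's method: the function name `definite_solution` is our own, and the sketch assumes a ≠ 0.

```python
import math

def definite_solution(a, b, y0):
    """Return y(t) from (15.5') for dy/dt + a*y = b with y(0) = y0 (requires a != 0)."""
    if a == 0:
        raise ValueError("formula (15.5') requires a != 0")
    y_p = b / a          # particular integral
    A = y0 - y_p         # arbitrary constant, definitized by the initial condition
    return lambda t: A * math.exp(-a * t) + y_p

example1 = definite_solution(2, 6, 10)   # dy/dt + 2y = 6, y(0) = 10
example2 = definite_solution(4, 0, 1)    # dy/dt + 4y = 0, y(0) = 1 (homogeneous case)
print(example1(0.0), example1(1.0))
print(example2(0.0), example2(1.0))
```

Setting b = 0 in the call reproduces the homogeneous formula (15.3'), which mirrors the remark that (15.3') is a special case of (15.5').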


What if a = 0, so that the solution in (15.5') is undefined? In that case, the differential
equation is of the extremely simple form

$$\frac{dy}{dt} = b \qquad (15.6)$$


By straight integration, its general solution can be readily found to be 

$$y(t) = bt + c \qquad (15.7)$$

where c is an arbitrary constant. The two component terms in (15.7) can, in fact, again be
identified as the particular integral and the complementary function of the given
differential equation, respectively. Since a = 0, the complementary function can be expressed
simply as

$$y_c = Ae^{-at} = Ae^{0} = A \qquad (A = \text{an arbitrary constant})$$

As to the particular integral, the fact that the constant solution y = k fails to work in the
present case of a = 0 suggests that we should try instead a nonconstant solution. Let us
consider the simplest possible type of the latter, namely, y = kt. If y = kt, then dy/dt = k,
and the complete equation (15.6) will reduce to k = b, so that we may write

$$y_p = bt \qquad (a = 0)$$

Our new trial solution indeed works! The general solution of (15.6) is therefore

$$y(t) = y_c + y_p = A + bt \qquad [\text{general solution, case of } a = 0] \qquad (15.7')$$

which is identical with the result in (15.7), because c and A are but alternative notations for
an arbitrary constant. Note, however, that in the present case, $y_c$ is a constant whereas $y_p$ is
a function of time, the exact opposite of the situation in (15.5).

By definitizing the arbitrary constant, we find the definite solution to be

$$y(t) = y(0) + bt \qquad [\text{definite solution, case of } a = 0] \qquad (15.7'')$$


Example 3 Solve the equation dy/dt = 2, with the initial condition y(0) = 5. The solution is, by
(15.7''),

$$y(t) = 5 + 2t$$
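As a side check that is not in the original exposition, one may ask whether the two definite solutions are consistent with each other. Rearranging (15.5') and letting a approach zero suggests that (15.7'') is its limiting form:

```latex
\begin{aligned}
y(t) &= \left[y(0) - \frac{b}{a}\right]e^{-at} + \frac{b}{a}
      = y(0)\,e^{-at} + b\,\frac{1 - e^{-at}}{a}, \\
\lim_{a \to 0} y(t) &= y(0) + bt
\qquad \text{since } \lim_{a \to 0}\frac{1 - e^{-at}}{a} = t .
\end{aligned}
```

The last limit follows from L'Hôpital's rule (differentiating numerator and denominator with respect to a), so the case a = 0 can be viewed as a continuous extension of the case a ≠ 0.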


Verification of the Solution 

It is true of all solutions of differential equations that their validity can always be checked
by differentiation. If we try that on the solution (15.5'), we can obtain the derivative

$$\frac{dy}{dt} = -a\left[y(0) - \frac{b}{a}\right]e^{-at}$$

When this expression for dy/dt and the expression for y(t) as shown in (15.5') are
substituted into the left side of the differential equation (15.4), that side should reduce exactly
to the value of the constant term b on the right side of (15.4) if the solution is correct.
Performing this substitution, we indeed find that

$$-a\left[y(0) - \frac{b}{a}\right]e^{-at} + a\left\{\left[y(0) - \frac{b}{a}\right]e^{-at} + \frac{b}{a}\right\} = b$$

Thus our solution is correct, provided it also satisfies the initial condition. To check the
latter, let us set t = 0 in the solution (15.5'). Since the result

$$y(0) = \left[y(0) - \frac{b}{a}\right] + \frac{b}{a} = y(0)$$

is an identity, the initial condition is indeed satisfied.

It is recommended that, as a final step in the process of solving a differential equation,
you make it a habit to check the validity of your answer by making sure (1) that the
derivative of the time path y(t) is consistent with the given differential equation and (2) that the
definite solution satisfies the initial condition.
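The two-part check can also be carried out numerically. The following is a rough sketch under our own assumptions: the helper name `residual`, the step size h, and the use of a central difference in place of exact differentiation are illustrative choices, so the residuals come out close to zero rather than exactly zero.

```python
import math

def residual(y, a, b, t, h=1e-5):
    """Left side minus right side of dy/dt + a*y = b, with dy/dt from a central difference."""
    dydt = (y(t + h) - y(t - h)) / (2 * h)
    return dydt + a * y(t) - b

a, b, y0 = 2.0, 6.0, 10.0                                # data of Example 1
y = lambda t: (y0 - b / a) * math.exp(-a * t) + b / a    # candidate solution from (15.5')

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, residual(y, a, b, t))    # check (1): each residual should be near zero
print(abs(y(0.0) - y0) < 1e-12)       # check (2): the initial condition
```

A residual that is far from zero at some t would signal an error in the candidate time path.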


EXERCISE 15.1 

1. Find $y_c$, $y_p$, the general solution, and the definite solution, given:

(a) dy/dt + [term illegible] = 12; y(0) = 2
(b) dy/dt - 2y = 0; y(0) = 9
(c) dy/dt + 10y = 15; y(0) = 0
(d) 2 dy/dt + 4y = 6; y(0) = 1½

2. Check the validity of your answers to Prob. 1.

3. Find the solution of each of the following by using an appropriate formula developed
in the text:

(a) dy/dt + y = 4; y(0) = 0
(b) dy/dt = 23; y(0) = 1
(c) dy/dt - 5y = 0; y(0) = 6
(d) dy/dt + [term illegible] = 2; y(0) = 4
(e) dy/dt - 7y = 7; y(0) = 7
(f) 3 dy/dt + 6y = 5; y(0) = 0


4. Check the validity of your answers to Prob. 3. 


15.2 Dynamics of Market Price 


In the (macro) Domar growth model, we found an application of the homogeneous case of
linear differential equations of the first order. To illustrate the nonhomogeneous case, let us
present a (micro) dynamic model of the market.

The Framework

Suppose that, for a particular commodity, the demand and supply functions are as follows:

$$\begin{aligned} Q_d &= \alpha - \beta P \qquad (\alpha, \beta > 0) \\ Q_s &= -\gamma + \delta P \qquad (\gamma, \delta > 0) \end{aligned} \qquad (15.8)$$

Then, according to (3.4), the equilibrium price should be†

$$P^* = \frac{\alpha + \gamma}{\beta + \delta} \qquad (= \text{some positive constant}) \qquad (15.9)$$


If it happens that the initial price P(0) is precisely at the level of $P^*$, the market will clearly
be in equilibrium already, and no dynamic analysis will be needed. In the more interesting
case of $P(0) \neq P^*$, however, $P^*$ is attainable (if ever) only after a due process of
adjustment, during which not only will price change over time but $Q_d$ and $Q_s$, being functions of
P, must change over time as well. In this light, then, the price and quantity variables can all
be taken to be functions of time.

Our dynamic question is this: Given sufficient time for the adjustment process to work
itself out, does it tend to bring price to the equilibrium level $P^*$? That is, does the time path
P(t) tend to converge to $P^*$ as $t \to \infty$?


The Time Path 

To answer this question, we must first find the time path P(t). But that, in turn, requires a
specific pattern of price change to be prescribed first. In general, price changes are
governed by the relative strength of the demand and supply forces in the market. Let us assume,
for the sake of simplicity, that the rate of price change (with respect to time) at any moment
is always directly proportional to the excess demand $(Q_d - Q_s)$ prevailing at that moment.
Such a pattern of change can be expressed symbolically as

$$\frac{dP}{dt} = j(Q_d - Q_s) \qquad (j > 0) \qquad (15.10)$$

where j represents a (constant) adjustment coefficient. With this pattern of change, we can
have dP/dt = 0 if and only if $Q_d = Q_s$. In this connection, it may be instructive to note
two senses of the term equilibrium price: the intertemporal sense (P being constant over
time) and the market-clearing sense (the equilibrium price being one that equates $Q_d$ and
$Q_s$). In the present model, the two senses happen to coincide with each other, but this may
not be true of all models.

By virtue of the demand and supply functions in (15.8), we can express (15.10)
specifically in the form

$$\frac{dP}{dt} = j(\alpha - \beta P + \gamma - \delta P) = j(\alpha + \gamma) - j(\beta + \delta)P$$

or

$$\frac{dP}{dt} + j(\beta + \delta)P = j(\alpha + \gamma) \qquad (15.10')$$


† We have switched from the symbols (a, b, c, d) of (3.4) to ($\alpha$, $\beta$, $\gamma$, $\delta$) here to avoid any possible
confusion with the use of a and b as parameters in the differential equation (15.4) which we shall
presently apply to the market model.


FIGURE 15.1


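Since this is precisely in the form of the differential equation (15.4), and since the
coefficient $j(\beta + \delta)$ is nonzero, we can apply the solution formula (15.5') and write the solution,
the time path of price, as

$$\begin{aligned} P(t) &= \left[P(0) - \frac{\alpha + \gamma}{\beta + \delta}\right]e^{-j(\beta + \delta)t} + \frac{\alpha + \gamma}{\beta + \delta} \\ &= [P(0) - P^*]e^{-kt} + P^* \qquad [\text{by (15.9); } k \equiv j(\beta + \delta)] \end{aligned} \qquad (15.11)$$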


The Dynamic Stability of Equilibrium 

In the end, the question originally posed, namely, whether $P(t) \to P^*$ as $t \to \infty$, amounts
to the question of whether the first term on the right of (15.11) will tend to zero as $t \to \infty$.
Since P(0) and $P^*$ are both constant, the key factor will be the exponential expression
$e^{-kt}$. In view of the fact that k > 0, that expression does tend to zero as $t \to \infty$.
Consequently, on the assumptions of our model, the time path will indeed lead the price toward
the equilibrium position. In a situation of this sort, where the time path of the relevant
variable P(t) converges to the level $P^*$ (interpreted here in its role as the intertemporal, rather
than market-clearing, equilibrium), the equilibrium is said to be dynamically stable.

The concept of dynamic stability is an important one. Let us examine it further by a
more detailed analysis of (15.11). Depending on the relative magnitudes of P(0) and $P^*$,
the solution (15.11) really encompasses three possible cases. The first is $P(0) = P^*$, which
implies $P(t) = P^*$. In that event, the time path of price can be drawn as the horizontal
straight line in Fig. 15.1. As mentioned earlier, the attainment of equilibrium is in this case
a fait accompli. Second, we may have $P(0) > P^*$. In this case, the first term on the right of
(15.11) is positive, but it will decrease as the increase in t lowers the value of $e^{-kt}$. Thus the
time path will approach the equilibrium level $P^*$ from above, as illustrated by the top curve
in Fig. 15.1. Third, in the opposite case of $P(0) < P^*$, the equilibrium level $P^*$ will be
approached from below, as illustrated by the bottom curve in the same figure. In general,
to have dynamic stability, the deviation of the time path from equilibrium must either be
identically zero (as in case 1) or steadily decrease with time (as in cases 2 and 3).

A comparison of (15.11) with (15.5') tells us that the $P^*$ term, the counterpart of b/a,
is nothing but the particular integral $y_p$, whereas the exponential term is the (definitized)
complementary function $y_c$. Thus, we now have an economic interpretation for $y_c$ and
$y_p$: $y_p$ represents the intertemporal equilibrium level of the relevant variable, and $y_c$ is the
deviation from equilibrium. Dynamic stability requires the asymptotic vanishing of the
complementary function as t becomes infinite.

In this model, the particular integral is a constant, so we have a stationary equilibrium
in the intertemporal sense, represented by $P^*$. If the particular integral is nonconstant, as in
(15.7'), on the other hand, we may interpret it as a moving equilibrium.
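To see the three cases of (15.11) at work, the time path can be tabulated for some made-up numbers. The sketch below is purely illustrative: the function name `price_path` and the parameter values (α = 10, β = 2, γ = 2, δ = 1, j = 0.5) are hypothetical and do not come from the text.

```python
import math

def price_path(alpha, beta, gamma, delta, j, p0):
    """Return (P_star, k, P) where P(t) = [P(0) - P*] e^{-kt} + P*, as in (15.11)."""
    p_star = (alpha + gamma) / (beta + delta)   # equilibrium price, (15.9)
    k = j * (beta + delta)                      # k = j(beta + delta)
    return p_star, k, lambda t: (p0 - p_star) * math.exp(-k * t) + p_star

# Hypothetical parameter values, chosen only for illustration.
p_star, k, from_above = price_path(10, 2, 2, 1, 0.5, p0=6.0)   # P(0) > P*
_, _, from_below = price_path(10, 2, 2, 1, 0.5, p0=2.0)        # P(0) < P*
_, _, at_rest = price_path(10, 2, 2, 1, 0.5, p0=4.0)           # P(0) = P*

for t in (0, 1, 2, 5, 10):
    print(t, round(from_above(t), 4), round(at_rest(t), 4), round(from_below(t), 4))
```

With these values $P^* = 4$ and k = 1.5, so the deviation $y_c = [P(0) - P^*]e^{-kt}$ shrinks toward zero from either side while $y_p = P^*$ stays fixed.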

An Alternative Use of the Model 

What we have done in the preceding is to analyze the dynamic stability of equilibrium (the
convergence of the time path), given certain sign specifications for the parameters. An
alternative type of inquiry is: In order to ensure dynamic stability, what specific restrictions
must be imposed upon the parameters?

The answer to that is contained in the solution (15.11). If we allow $P(0) \neq P^*$, we see
that the first ($y_c$) term in (15.11) will tend to zero as $t \to \infty$ if and only if k > 0, that is,
if and only if

$$j(\beta + \delta) > 0$$

Thus, we can take this last inequality as the required restriction on the parameters j (the
adjustment coefficient of price), $\beta$ (the negative of the slope of the demand curve, plotted with
Q on the vertical axis), and $\delta$ (the slope of the supply curve, plotted similarly).

In case the price adjustment is of the “normal” type, with j > 0, so that excess demand
drives price up rather than down, then this restriction becomes merely $(\beta + \delta) > 0$ or,
equivalently,

$$\delta > -\beta$$

To have dynamic stability in that event, the slope of the supply must exceed the slope of the
demand. When both demand and supply are normally sloped ($-\beta < 0$, $\delta > 0$), as in
(15.8), this requirement is obviously met. But even if one of the curves is sloped
“perversely,” the condition may still be fulfilled, such as when $\delta = 1$ and $-\beta = 1/2$
(positively sloped demand). The latter situation is illustrated in Fig. 15.2, where the equilibrium
price $P^*$ is, as usual, determined by the point of intersection of the two curves. If the initial
price happens to be at $P_1$, then $Q_d$ (distance $P_1G$) will exceed $Q_s$ (distance $P_1F$), and the
excess demand (FG) will drive price up. On the other hand, if price is initially at $P_2$, then
there will be a negative excess demand MN, which will drive the price down. As the two
arrows in the figure show, therefore, the price adjustment in this case will be toward the
equilibrium, no matter which side of $P^*$ we start from. We should emphasize, however, that
while these arrows can display the direction, they are incapable of indicating the magnitude
of change. Thus Fig. 15.2 is basically static, not dynamic, in nature, and can serve only to
illustrate, not to replace, the dynamic analysis presented.

FIGURE 15.2
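The parameter restriction can be tried out on a few slope combinations. In this small sketch the function name `is_dynamically_stable` and the choice j = 1 are our own assumptions; the second call uses the "perverse" demand slope $-\beta = 1/2$ mentioned above.

```python
def is_dynamically_stable(j, beta, delta):
    """True when k = j*(beta + delta) > 0, so that e^{-kt} tends to zero."""
    return j * (beta + delta) > 0

print(is_dynamically_stable(j=1.0, beta=2.0, delta=1.0))    # normal slopes -> True
print(is_dynamically_stable(j=1.0, beta=-0.5, delta=1.0))   # -beta = 1/2 < delta -> True
print(is_dynamically_stable(j=1.0, beta=-2.0, delta=1.0))   # -beta = 2 > delta -> False
```

The third case shows the condition failing once the positively sloped demand curve has the larger slope, so that $\delta > -\beta$ no longer holds.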


EXERCISE 15.2 

1. If both the demand and supply in Fig. 15.2 are negatively sloped instead, which curve
should be steeper in order to have dynamic stability? Does your answer conform to the
criterion $\delta > -\beta$?

2. Show that (15.10') can be rewritten as $dP/dt + k(P - P^*) = 0$. If we let $P - P^* \equiv \Delta$
(signifying deviation), so that $d\Delta/dt = dP/dt$, the differential equation can be further
rewritten as

$$\frac{d\Delta}{dt} + k\Delta = 0$$

Find the time path $\Delta(t)$, and discuss the condition for dynamic stability.

3. The dynamic market model discussed in this section is closely patterned after the static
one in Sec. 3.2. What specific new feature is responsible for transforming the static
model into a dynamic one?

4. Let the demand and supply be

$$Q_d = \alpha - \beta P + \sigma\frac{dP}{dt} \qquad Q_s = -\gamma + \delta P \qquad (\alpha, \beta, \gamma, \delta > 0)$$

(a) Assuming that the rate of change of price over time is directly proportional to the
excess demand, find the time path P(t) (general solution).

(b) What is the intertemporal equilibrium price? What is the market-clearing
equilibrium price?

(c) What restriction on the parameter $\sigma$ would ensure dynamic stability?

5. Let the demand and supply be

$$Q_d = \alpha - \beta P - \eta\frac{dP}{dt} \qquad Q_s = \delta P \qquad (\alpha, \beta, \eta, \delta > 0)$$

(a) Assuming that the market is cleared at every point of time, find the time path P(t)
(general solution).

(b) Does this market have a dynamically stable intertemporal equilibrium price?

(c) The assumption of the present model that $Q_d = Q_s$ for all t is identical with that of
the static market model in Sec. 3.2. Nevertheless, we still have a dynamic model
here. How come?


15.3 Variable Coefficient and Variable Term 


In the more general case of a first-order linear differential equation

$$\frac{dy}{dt} + u(t)y = w(t) \qquad (15.12)$$

u(t) and w(t) represent a variable coefficient and a variable term, respectively. How do we
find the time path y(t) in this case?

The Homogeneous Case 

For the homogeneous case, where w(t) = 0, the solution is still easy to obtain. Since the
differential equation is in the form

$$\frac{dy}{dt} + u(t)y = 0 \qquad\text{or}\qquad \frac{1}{y}\frac{dy}{dt} = -u(t) \qquad (15.13)$$

we have, by integrating both sides in turn with respect to t,

$$\text{Left side} = \int \frac{1}{y}\frac{dy}{dt}\,dt = \int \frac{dy}{y} = \ln y + c \qquad (\text{assuming } y > 0)$$

$$\text{Right side} = \int -u(t)\,dt = -\int u(t)\,dt$$

In the latter, the integration process cannot be carried further because u(t) has not been
given a specific form; thus we have to settle for just a general integral expression. When the
two sides are equated, the result is

$$\ln y = -c - \int u(t)\,dt$$

Then the desired y path can be obtained by taking the antilog of ln y:

$$y(t) = e^{\ln y} = e^{-c}e^{-\int u(t)\,dt} = Ae^{-\int u(t)\,dt} \qquad \text{where } A \equiv e^{-c} \qquad (15.14)$$

This is the general solution of the differential equation (15.13).

To highlight the variable nature of the coefficient u(t), we have so far explicitly written
out the argument t. For notational simplicity, however, we shall from here on omit the
argument and shorten u(t) to u.

As compared with the general solution (15.3) for the constant-coefficient case, the only
modification in (15.14) is the replacement of the $e^{-at}$ expression by the more complicated
expression $e^{-\int u\,dt}$. The rationale behind this change can be better understood if we
interpret the at term in $e^{-at}$ as an integral: $\int a\,dt = at$ (plus a constant which can be absorbed
into the A term, since e raised to a constant power is again a constant). In this light, the
difference between the two general solutions in fact turns into a similarity. For in both cases
we are taking the coefficient of the y term in the differential equation (a constant term a in
one case, and a variable term u in the other) and integrating that with respect to t, and then
taking the negative of the resulting integral as the exponent of e.

Once the general solution is obtained, it is a relatively simple matter to get the definite
solution with the help of an appropriate initial condition.

Example 1 Find the general solution of the equation $dy/dt + 3t^2y = 0$. Here we have $u = 3t^2$, and
$\int u\,dt = \int 3t^2\,dt = t^3 + c$. Therefore, by (15.14), we may write the solution as

$$y(t) = Ae^{-(t^3 + c)} = Ae^{-t^3}e^{-c} = Be^{-t^3} \qquad \text{where } B \equiv Ae^{-c}$$

Observe that if we had omitted the constant of integration c, we would have lost no
information, because then we would have obtained $y(t) = Ae^{-t^3}$, which is really the
identical solution since A and B both represent arbitrary constants. In other words, the expression
$e^{-c}$, where the constant c makes its only appearance, can always be subsumed under the
other constant A.
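A numerical cross-check of Example 1 is possible without any calculus at all. The sketch below integrates $dy/dt = -3t^2y$ step by step with the classical fourth-order Runge-Kutta method (a standard numerical technique, not something used in the text) and compares the result with $Be^{-t^3}$. The helper name `rk4`, the step count, and the values B = 2 and t = 1.5 are arbitrary assumptions.

```python
import math

def rk4(f, t0, y0, t1, steps=1000):
    """Integrate dy/dt = f(t, y) from t0 to t1 with the classical Runge-Kutta method."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

B = 2.0                                              # arbitrary constant; equals y(0) here
numeric = rk4(lambda t, y: -3 * t**2 * y, 0.0, B, 1.5)
closed_form = B * math.exp(-1.5**3)                  # B e^{-t^3} at t = 1.5
print(numeric, closed_form)
```

The two printed numbers should agree to many decimal places, which supports (but of course does not prove) the closed-form solution.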
The Nonhomogeneous Case

For the nonhomogeneous case, where $w(t) \neq 0$, the solution is not as easy to obtain. We
shall try to find that solution via the concept of exact differential equations, to be discussed
in Sec. 15.4. It does no harm, however, to state the result here first: Given the differential
equation (15.12), the general solution is

$$y(t) = e^{-\int u\,dt}\left(A + \int we^{\int u\,dt}\,dt\right) \qquad (15.15)$$

where A is an arbitrary constant that can be definitized if we have an appropriate initial
condition.

It is of interest that this general solution, like the solution in the constant-coefficient
constant-term case, again consists of two additive components. Furthermore, one of these
two, $Ae^{-\int u\,dt}$, is nothing but the general solution of the reduced (homogeneous) equation,
derived earlier in (15.14), and is therefore in the nature of a complementary function.

Example 2 Find the general solution of the equation $dy/dt + 2ty = t$. Here we have

$$u = 2t \qquad w = t \qquad\text{and}\qquad \int u\,dt = t^2 + k \qquad (k \text{ arbitrary})$$

Thus, by (15.15), we have

$$\begin{aligned}
y(t) &= e^{-(t^2 + k)}\left(A + \int te^{t^2 + k}\,dt\right) \\
&= e^{-t^2}e^{-k}\left(A + e^{k}\int te^{t^2}\,dt\right) \\
&= Ae^{-k}e^{-t^2} + e^{-t^2}\left(\tfrac{1}{2}e^{t^2} + c\right) \qquad [e^{-k}e^{k} = 1] \\
&= (Ae^{-k} + c)e^{-t^2} + \tfrac{1}{2} \\
&= Be^{-t^2} + \tfrac{1}{2} \qquad \text{where } B \equiv Ae^{-k} + c \text{ is arbitrary}
\end{aligned}$$


The validity of this solution can again be checked by differentiation. 

It is interesting to note that, in this example, we could again have omitted the constant
of integration k, as well as the constant of integration c, without affecting the final outcome.
This is because both k and c may be subsumed under the arbitrary constant B in the final
solution. You are urged to try out the simpler process of applying (15.15) without using the
constants k and c, and verify that the same solution will emerge.
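Formula (15.15) can also be evaluated numerically when the integral of $we^{\int u\,dt}$ is awkward to find by hand. The sketch below makes several assumptions of its own: both integrals are taken from 0 to t, so that the arbitrary constant A coincides with y(0); the integral of u is supplied in closed form as a function `U` with U(0) = 0; and the remaining integral is approximated by Simpson's rule. All names are illustrative.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def linear_ode_solution(U, w, y0):
    """y(t) = e^{-U(t)} * (y0 + integral from 0 to t of w(s) e^{U(s)} ds), with U(0) = 0."""
    return lambda t: math.exp(-U(t)) * (y0 + simpson(lambda s: w(s) * math.exp(U(s)), 0.0, t))

# Example 2: u = 2t and w = t, so U(t) = t^2; y(0) = 2 is a made-up initial condition.
y = linear_ode_solution(U=lambda t: t * t, w=lambda t: t, y0=2.0)
expected = lambda t: (2.0 - 0.5) * math.exp(-t * t) + 0.5   # B e^{-t^2} + 1/2 with B = y(0) - 1/2
for t in (0.0, 0.5, 1.0):
    print(t, y(t), expected(t))
```

The numerical and closed-form columns should agree closely, with B in $Be^{-t^2} + \tfrac{1}{2}$ pinned down as y(0) − 1/2 by setting t = 0.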


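Example 3 Solve the equation $dy/dt + 4ty = 4t$. This time we shall omit the constants of integration.
Since

$$u = 4t \qquad w = 4t \qquad\text{and}\qquad \int u\,dt = 2t^2 \qquad [\text{constant omitted}]$$

the general solution is, by (15.15),

$$\begin{aligned}
y(t) &= e^{-2t^2}\left(A + \int 4te^{2t^2}\,dt\right) = e^{-2t^2}\left(A + e^{2t^2}\right) \qquad [\text{constant omitted}] \\
&= Ae^{-2t^2} + 1
\end{aligned}$$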
As may be expected, the omission of the constants of integration serves to simplify the
procedure substantially.

The differential equation $dy/dt + uy = w$ in (15.12) is more general than the equation
$dy/dt + ay = b$ in (15.4), since u and w are not necessarily constant, as are a and b.
Accordingly, solution formula (15.15) is also more general than solution formula (15.5). In fact,
when we set u = a and w = b, (15.15) should reduce to (15.5). This is indeed the case. For
when we have

$$u = a \qquad w = b \qquad\text{and}\qquad \int u\,dt = at \qquad [\text{constant omitted}]$$

then (15.15) becomes

$$y(t) = e^{-at}\left(A + \int be^{at}\,dt\right) = e^{-at}\left(A + \frac{b}{a}e^{at}\right) = Ae^{-at} + \frac{b}{a} \qquad [\text{constant omitted}]$$

which is identical with (15.5).

EXERCISE 15.3 

Solve the following first-order linear differential equations; if an initial condition is given, 
definitize the arbitrary constant: 

2. dy/dt + 2ty = 0

3. dy/dt + 2ty = t; y(0) = [illegible]

4. dy/dt + $t^2$y = 5$t^2$; y(0) = 6

5. 2 dy/dt + 12y + 2$e^t$ = 0; y(0) = [illegible]

6. dy/dt + y = t


15.4 Exact Differential Equations

We shall now introduce the concept of exact differential equations and use the solution
method pertaining thereto to obtain the solution formula (15.15) previously cited for the
differential equation (15.12). Even though our immediate purpose is to use it to solve a linear
differential equation, an exact differential equation can be either linear or nonlinear by itself.

Exact Differential Equations 

Given a function of two variables F(y, t), its total differential is

$$dF = \frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial t}\,dt$$

When this differential is set equal to zero, the resulting equation

$$\frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial t}\,dt = 0$$

is known as an exact differential equation, because its left side is exactly the differential of
the function F(y, t). For instance, given

$$F(y, t) = y^2t + k \qquad (k \text{ a constant})$$

the total differential is

$$dF = 2yt\,dy + y^2\,dt$$

thus the differential equation

$$2yt\,dy + y^2\,dt = 0 \qquad\text{or}\qquad \frac{dy}{dt} + \frac{y^2}{2yt} = 0 \qquad (15.16)$$

is exact.

In general, a differential equation

$$M\,dy + N\,dt = 0 \qquad (15.17)$$

is exact if and only if there exists a function F(y, t) such that $M = \partial F/\partial y$ and
$N = \partial F/\partial t$. By Young's theorem, which states that $\partial^2F/\partial t\,\partial y = \partial^2F/\partial y\,\partial t$, however, we can
also state that (15.17) is exact if and only if

$$\frac{\partial M}{\partial t} = \frac{\partial N}{\partial y} \qquad (15.18)$$

This last equation gives us a simple test for the exactness of a differential equation. Applied
to (15.16), where $M = 2yt$ and $N = y^2$, this test yields $\partial M/\partial t = 2y = \partial N/\partial y$; thus the
exactness of the said differential equation is duly verified.
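The test (15.18) lends itself to a quick numerical screening. The sketch below is a rough illustration under our own assumptions: the function name `is_exact`, the sample points, and the tolerance are arbitrary, and the partial derivatives are approximated by central differences. Agreement at a handful of points is only evidence of exactness, not a proof, whereas a clear disagreement at any point does show the equation is inexact.

```python
def is_exact(M, N, points, h=1e-5, tol=1e-6):
    """Compare dM/dt with dN/dy at sample (y, t) points using central differences."""
    for y, t in points:
        dM_dt = (M(y, t + h) - M(y, t - h)) / (2 * h)
        dN_dy = (N(y + h, t) - N(y - h, t)) / (2 * h)
        if abs(dM_dt - dN_dy) > tol:
            return False
    return True

points = [(1.0, 1.0), (2.0, 0.5), (-1.5, 3.0)]   # arbitrary sample points
print(is_exact(lambda y, t: 2 * y * t, lambda y, t: y**2, points))   # equation (15.16)
print(is_exact(lambda y, t: 2 * t, lambda y, t: y, points))          # 2t dy + y dt = 0
```

The first call concerns (15.16) and passes the screening; the second concerns the equation 2t dy + y dt = 0, for which $\partial M/\partial t = 2$ while $\partial N/\partial y = 1$, so it fails.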

Note that no restrictions have been placed on the terms M and N with regard to the
manner in which the variable y occurs. Thus an exact differential equation may very well be
nonlinear (in y). Nevertheless, it will always be of the first order and the first degree.

Being exact, the differential equation merely says

$$dF(y, t) = 0$$

Thus its general solution should clearly be in the form

$$F(y, t) = c$$

To solve an exact differential equation is basically, therefore, to search for the (primitive)
function F(y, t) and then set it equal to an arbitrary constant. Let us outline a method of
finding this for the equation M dy + N dt = 0.


Method of Solution 

To begin with, since $M = \partial F/\partial y$, the function F must contain the integral of M with
respect to the variable y; hence we can write out a preliminary result, in a yet indeterminate
form, as follows:

$$F(y, t) = \int M\,dy + \psi(t) \qquad (15.19)$$

Here M, a partial derivative, is to be integrated with respect to y only; that is, t is to be
treated as a constant in the integration process, just as it was treated as a constant in the
partial differentiation of F(y, t) that resulted in $M = \partial F/\partial y$.† Since, in differentiating F(y, t)
partially with respect to y, any additive term containing only the variable t and/or some
constants (but with no y) would drop out, we must now take care to reinstate such terms in the
integration process. This explains why we have introduced in (15.19) a general term $\psi(t)$,
which, though not exactly the same as a constant of integration, has a precisely identical
role to play as the latter. It is relatively easy to get $\int M\,dy$; but how do we pin down the
exact form of this $\psi(t)$ term?

The trick is to utilize the fact that $N = \partial F/\partial t$. But the procedure is best explained with
the help of specific examples.


Example 1 Solve the exact differential equation

$$2yt\,dy + y^2\,dt = 0 \qquad [\text{reproduced from (15.16)}]$$

In this equation, we have

$$M = 2yt \qquad\text{and}\qquad N = y^2$$

Step i By (15.19), we can first write the preliminary result

$$F(y, t) = \int 2yt\,dy + \psi(t) = y^2t + \psi(t)$$

Note that we have omitted the constant of integration, because it can automatically be
merged into the expression $\psi(t)$.

Step ii If we differentiate the result from Step i partially with respect to t, we can obtain

$$\frac{\partial F}{\partial t} = y^2 + \psi'(t)$$

But since $N = \partial F/\partial t$, we can equate $N = y^2$ and $\partial F/\partial t = y^2 + \psi'(t)$ to get

$$\psi'(t) = 0$$

Step iii Integration of the last result gives us

$$\psi(t) = \int \psi'(t)\,dt = \int 0\,dt = k$$

and now we have a specific form of $\psi(t)$. It happens in the present case that $\psi(t)$ is simply
a constant; more generally, it can be a nonconstant function of t.

Step iv The results of Steps i and iii can be combined to yield

$$F(y, t) = y^2t + k$$

The solution of the exact differential equation should then be F(y, t) = c. But since the
constant k can be merged into c, we may write the solution simply as

$$y^2t = c \qquad\text{or}\qquad y(t) = ct^{-1/2}$$

where c is arbitrary.


† Some writers employ the operator symbol $\int(\cdots)\,\partial y$ to emphasize that the integration is with respect
to y only. We shall still use the symbol $\int(\cdots)\,dy$ here, since there is little possibility of confusion.

Example 2 Solve the equation $(t + 2y)\,dy + (y + 3t^2)\,dt = 0$. First let us check whether this is an
exact differential equation. Setting $M = t + 2y$ and $N = y + 3t^2$, we find that
$\partial M/\partial t = 1 = \partial N/\partial y$. Thus the equation passes the exactness test. To find its solution, we again
follow the procedure outlined in Example 1.

Step i Apply (15.19) and write

$$F(y, t) = \int (t + 2y)\,dy + \psi(t) = yt + y^2 + \psi(t) \qquad [\text{constant merged into } \psi(t)]$$

Step ii Differentiate this result with respect to t, to get

$$\frac{\partial F}{\partial t} = y + \psi'(t)$$

Then, equating this to $N = y + 3t^2$, we find that

$$\psi'(t) = 3t^2$$

Step iii Integrate this last result to get

$$\psi(t) = \int 3t^2\,dt = t^3 \qquad [\text{constant may be omitted}]$$

Step iv Combine the results of Steps i and iii to get the complete form of the function

$$F(y, t) = yt + y^2 + t^3$$

which implies that the solution of the given differential equation is

$$yt + y^2 + t^3 = c$$

You should verify that setting the total differential of this equation equal to zero will indeed
produce the given differential equation.
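Besides the verification by total differentiation, one can check the implicit solution numerically: along any path y(t) that obeys the differential equation, F(y, t) should stay constant. The sketch below traces such a path with the classical Runge-Kutta method, which is a standard numerical tool rather than anything from the text. The helper names, the starting point y(0) = 1, the interval [0, 0.5], and the step count are all our own assumptions; the interval is chosen so that t + 2y stays away from zero.

```python
def F(y, t):
    """Primitive function found in Example 2."""
    return y * t + y**2 + t**3

def slope(t, y):
    """dy/dt implied by (t + 2y) dy + (y + 3t^2) dt = 0, valid where t + 2y != 0."""
    return -(y + 3 * t**2) / (t + 2 * y)

def rk4_path(f, t0, y0, t1, steps=1000):
    """Return the list of (t, y) points of a Runge-Kutta integration of dy/dt = f(t, y)."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    path = [(t, y)]
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        path.append((t, y))
    return path

path = rk4_path(slope, 0.0, 1.0, 0.5)       # y(0) = 1, so c = F(1, 0) = 1
drift = max(abs(F(y, t) - 1.0) for t, y in path)
print(drift)                                 # should be very close to zero
```

A drift near zero indicates that the path stays on the level curve $yt + y^2 + t^3 = 1$, as the solution F(y, t) = c requires.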


This four-step procedure can be used to solve any exact differential equation.
Interestingly, it may even be applicable when the given equation is not exact. To see this, however,
we must first introduce the concept of integrating factor.


Integrating Factor 

Sometimes an inexact differential equation can be made exact by multiplying every term of 
the equation by a particular common factor. Such a factor is called an integrating factor. 


Example 3 The differential equation

$$2t\,dy + y\,dt = 0$$

is not exact, because it does not satisfy (15.18):

$$\frac{\partial M}{\partial t} = \frac{\partial}{\partial t}(2t) = 2 \neq \frac{\partial N}{\partial y} = \frac{\partial}{\partial y}(y) = 1$$

However, if we multiply each term by y, the given equation will turn into (15.16), which has
been established to be exact. Thus y is an integrating factor for the differential equation in
the present example.

When an integrating factor can be found for an inexact differential equation, it is always
possible to render it exact, and then the four-step solution procedure can be readily put
to use.

Solution of First-Order Linear Differential Equations 

The general first-order linear differential equation

$$\frac{dy}{dt} + uy = w$$

which, in the format of (15.17), can be expressed as

$$dy + (uy - w)\,dt = 0 \qquad (15.20)$$

has the integrating factor

$$e^{\int u\,dt} \equiv \exp\left(\int u\,dt\right)$$


This integrating factor, whose form is by no means intuitively obvious, can be “discovered”
as follows. Let $I$ be the (yet unknown) integrating factor. Multiplication of (15.20)
through by $I$ should convert it into an exact differential equation

$$\underbrace{I}_{M}\,dy + \underbrace{I(uy - w)}_{N}\,dt = 0 \qquad (15.20')$$

The exactness test dictates that $\partial M/\partial t = \partial N/\partial y$. Visual
inspection of the $M$ and $N$ expressions suggests that, since $M$ consists of $I$ only, and
since $u$ and $w$ are functions of $t$ alone, the exactness test will reduce to a very simple
condition if $I$ is also a function of $t$ alone. For then the test
$\partial M/\partial t = \partial N/\partial y$ becomes

$$\frac{dI}{dt} = Iu \qquad \text{or} \qquad \frac{dI/dt}{I} = u$$

Thus the special form $I = I(t)$ can indeed work, provided it has a rate of growth equal to
$u$, or more explicitly, $u(t)$. Accordingly, $I(t)$ should take the specific form

$$I(t) = Ae^{\int u\,dt} \qquad [\text{cf. (15.13) and (15.14)}]$$

As can be easily verified, however, the constant $A$ can be set equal to 1 without affecting the
ability of $I(t)$ to meet the exactness test. Thus we can use the simpler form
$e^{\int u\,dt}$ as the integrating factor.

Substitution of this integrating factor into (15.20') yields the exact differential equation

$$e^{\int u\,dt}\,dy + e^{\int u\,dt}(uy - w)\,dt = 0 \qquad (15.20'')$$


which can then be solved by the four-step procedure. 


Step i  First, we apply (15.19) to obtain

$$F(y, t) = \int e^{\int u\,dt}\,dy + \psi(t) = ye^{\int u\,dt} + \psi(t)$$

The result of integration emerges in this simple form because the integrand is independent
of the variable y.

Step ii  Next, we differentiate the result from Step i with respect to t, to get

$$\frac{\partial F}{\partial t} = yue^{\int u\,dt} + \psi'(t) \qquad [\text{chain rule}]$$

And, since this can be equated to $N = e^{\int u\,dt}(uy - w)$, we have

$$\psi'(t) = -we^{\int u\,dt}$$

Step iii  Straight integration now yields

$$\psi(t) = -\int we^{\int u\,dt}\,dt$$

Inasmuch as the functions u = u(t) and w = w(t) have not been given specific forms, nothing
further can be done about this integral, and we must be contented with this rather
general expression for $\psi(t)$.

Step iv  Substituting this $\psi(t)$ expression into the result of Step i, we find that

$$F(y, t) = ye^{\int u\,dt} - \int we^{\int u\,dt}\,dt$$

So the general solution of the exact differential equation (15.20''), and of the equivalent,
though inexact, first-order linear differential equation (15.20), is

$$ye^{\int u\,dt} - \int we^{\int u\,dt}\,dt = c$$

Upon rearrangement and substitution of the (arbitrary constant) symbol c by A, this can be
written as

$$y(t) = e^{-\int u\,dt}\left(A + \int we^{\int u\,dt}\,dt\right) \qquad (15.21)$$

which is exactly the result given earlier in (15.15). 
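
A quick numerical check can make formula (15.21) more concrete. The sketch below is an
illustration under assumed inputs, not an example from the text: it takes the hypothetical
coefficient functions u(t) = 2t and w(t) = t, for which the integral of u is t^2 and the integral
of w exp(t^2) is exp(t^2)/2, so that (15.21) gives y(t) = exp(-t^2)[A + exp(t^2)/2]. It then
confirms by finite differences that dy/dt + uy - w is approximately zero for several values of A.

```python
import math


def u(t):
    return 2.0 * t  # assumed coefficient function, chosen for illustration


def w(t):
    return t  # assumed variable term, chosen for illustration


def y(t, A):
    """Formula (15.21) evaluated in closed form for u = 2t and w = t."""
    return math.exp(-t * t) * (A + 0.5 * math.exp(t * t))


def residual(t, A, h=1e-6):
    """dy/dt + u*y - w, with dy/dt approximated by a central difference."""
    dydt = (y(t + h, A) - y(t - h, A)) / (2 * h)
    return dydt + u(t) * y(t, A) - w(t)


residuals = [abs(residual(t, A)) for A in (-1.0, 0.0, 3.0) for t in (0.0, 0.5, 1.5)]
print(max(residuals))  # a value close to zero, limited only by finite-difference error
```

The arbitrary constant A drops out of the residual, which is what one expects of a general
solution.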


EXERCISE 15.4 


1. Verify that each of the following differential equations is exact, and solve by the
four-step procedure:

(a) $2yt^3\,dy + 3y^2t^2\,dt = 0$
(b) $3y^2t\,dy + (y^3 + 2t)\,dt = 0$
(c) $t(1 + 2y)\,dy + y(1 + y)\,dt = 0$
(d) $\dfrac{dy}{dt} + \dfrac{2y^4t + 3t^2}{4y^3t^2} = 0$   [Hint: First convert to the form of (15.17).]

2. Are the following differential equations exact? If not, try t, y, and $y^2$ as possible
integrating factors.

(a) $2(t^3 + 1)\,dy + 3yt^2\,dt = 0$
(b) $4y^3t\,dy + (2y^4 - 3t)\,dt = 0$


3. By applying the four-step procedure to the general exact differential equation
$M\,dy + N\,dt = 0$, derive the following formula for the general solution of an exact
differential equation:

$$\int M\,dy + \int N\,dt - \int\left(\frac{\partial}{\partial t}\int M\,dy\right)dt = c$$


15.5 Nonlinear Differential Equations of the First Order and First Degree

In a linear differential equation, we restrict to the first degree not only the derivative dy/dt,
but also the dependent variable y, and we do not allow the product y(dy/dt) to appear.
When y appears in a power higher than one, the equation becomes nonlinear even if it only
contains the derivative dy/dt in the first degree. In general, an equation in the form

$$f(y, t)\,dy + g(y, t)\,dt = 0 \qquad (15.22)$$

or

$$\frac{dy}{dt} = h(y, t) \qquad (15.22')$$

where there is no restriction on the powers of y and t, constitutes a first-order first-degree
nonlinear differential equation because dy/dt is a first-order derivative in the first power.
Certain varieties of such equations can be solved with relative ease by more or less routine
procedures. We shall briefly discuss three cases.

Exact Differential Equations 

The first is the now-familiar case of exact differential equations. As was pointed out earlier,
the y variable can appear in an exact equation in a high power, as in (15.16),
$2yt\,dy + y^2\,dt = 0$, which you should compare with (15.22). True, the cancellation of the
common factor y from both terms on the left will reduce the equation to a linear form, but the
exactness property will be lost in that event. As an exact differential equation, therefore, it
must be regarded as nonlinear.

Since the solution method for exact differential equations has already been discussed, 
no further comment is necessary here. 

Separable Variables 

The differential equation in (15.22)

$$f(y, t)\,dy + g(y, t)\,dt = 0$$

may happen to possess the convenient property that the function f is in the variable y alone,
while the function g involves only the variable t, so that the equation reduces to the special
form

$$f(y)\,dy + g(t)\,dt = 0 \qquad (15.23)$$

In such an event, the variables are said to be separable, because the terms involving y
(consolidated into f(y)) can be mathematically separated from the terms involving t,
which are collected under g(t). To solve this special type of equation, only simple
integration techniques are required.


Example 1 


Solve the equation $3y^2\,dy - t\,dt = 0$. First let us rewrite the equation as

$$3y^2\,dy = t\,dt$$

Integrating the two sides (each of which is a differential) and equating the results, we get

$$\int 3y^2\,dy = \int t\,dt \qquad \text{or} \qquad y^3 + c_1 = \tfrac{1}{2}t^2 + c_2$$

Thus the general solution can be written as

$$y^3 = \tfrac{1}{2}t^2 + c \qquad \text{or} \qquad y(t) = \left(\tfrac{1}{2}t^2 + c\right)^{1/3}$$

The notable point here is that the integration of each term is performed with respect to 
a different variable; it is this which makes the separable-variable equation comparatively 
easy to handle. 

Example 2

Solve the equation $2t\,dy + y\,dt = 0$. At first glance, this differential equation does not
seem to belong in this spot, because it fails to conform to the general form of (15.23). To be
specific, the coefficients of dy and dt are seen to involve the "wrong" variables. However, a
simple transformation, dividing through by $2yt$ ($\neq 0$), will reduce the equation to the
separable-variable form

$$\frac{1}{y}\,dy + \frac{1}{2t}\,dt = 0$$

From our experience with Example 1, we can work toward the solution (without first
transposing a term) as follows:*

$$\int \frac{1}{y}\,dy + \int \frac{1}{2t}\,dt = c$$

so

$$\ln y + \tfrac{1}{2}\ln t = c \qquad \text{or} \qquad \ln\left(yt^{1/2}\right) = c$$

Thus the solution is

$$yt^{1/2} = e^c = k \qquad \text{or} \qquad y(t) = kt^{-1/2}$$

where k is an arbitrary constant, as are the symbols c and A employed elsewhere.

Note that, instead of solving the equation in Example 2 as we did, we could also have
transformed it first into an exact differential equation (by the integrating factor y) and then
solved it as such. The solution, already given in Example 1 of Sec. 15.4, must of course be
identical with the one just obtained by separation of variables. The point is that a given
differential equation can often be solved in more than one way, and therefore one may have a
choice of the method to be used. In other cases, a differential equation that is not amenable
to a particular method may nonetheless become so after an appropriate transformation.
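
The two separable-variable solutions can be checked numerically. The sketch below is only an
illustration: it substitutes y(t) = (t^2/2 + c)^(1/3) into 3y^2 dy/dt - t = 0 (Example 1) and
y(t) = k t^(-1/2) into 2t dy/dt + y = 0 (Example 2), approximating dy/dt by a central
difference. The helper names and the sample values of t, c, and k are assumptions chosen for the
illustration, with t and the constants kept positive.

```python
def derivative(f, t, h=1e-6):
    """Central-difference approximation of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)


def residual_example_1(t, c):
    """Left side of 3y^2 dy/dt - t = 0 for y(t) = (t^2/2 + c)^(1/3)."""
    y = lambda x: (0.5 * x * x + c) ** (1.0 / 3.0)
    return 3 * y(t) ** 2 * derivative(y, t) - t


def residual_example_2(t, k):
    """Left side of 2t dy/dt + y = 0 for y(t) = k t^(-1/2)."""
    y = lambda x: k * x ** -0.5
    return 2 * t * derivative(y, t) + y(t)


sample_t = [0.5, 1.0, 2.0, 4.0]
constants = (0.5, 1.0, 3.0)  # stand-ins for the arbitrary constants c and k

worst_1 = max(abs(residual_example_1(t, c)) for t in sample_t for c in constants)
worst_2 = max(abs(residual_example_2(t, k)) for t in sample_t for k in constants)
print(worst_1, worst_2)  # both should be close to zero
```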

Equations Reducible to the Linear Form 

If the differential equation dy/dt = h(y, t) happens to take the specific nonlinear form

$$\frac{dy}{dt} + Ry = Ty^m \qquad (15.24)$$

where R and T are two functions of t, and m is any number other than 0 and 1 (what if
m = 0 or m = 1?), then the equation, referred to as a Bernoulli equation, can always be
reduced to a linear differential equation and be solved as such.

* In the integration result, we should, strictly speaking, have written $\ln|y|$ and
$\tfrac{1}{2}\ln|t|$. If y and t can be assumed to be positive, as is appropriate in the
majority of economic contexts, then the result given in the text will occur.

The reduction procedure is relatively simple. First, we can divide (15.24) by $y^m$, to get

$$y^{-m}\frac{dy}{dt} + Ry^{1-m} = T$$

If we adopt a shorthand variable z as follows:

$$z = y^{1-m} \qquad \left[\text{so that } \frac{dz}{dt} = \frac{dz}{dy}\frac{dy}{dt} = (1 - m)y^{-m}\frac{dy}{dt}\right]$$

then the preceding equation can be written as

$$\frac{1}{1 - m}\frac{dz}{dt} + Rz = T$$

Moreover, after multiplying through by (1 - m) dt and rearranging, we can transform the
equation into

$$dz + [(1 - m)Rz - (1 - m)T]\,dt = 0 \qquad (15.24')$$

This is seen to be a first-order linear differential equation of the form (15.20), in which the
variable z has taken the place of y.

Clearly, we can apply formula (15.21) to find its solution z(t). Then, as a final step, we
can translate z back to y by reverse substitution.

Example 3

Solve the equation $dy/dt + ty = 3ty^2$. This is a Bernoulli equation, with m = 2 (giving us
$z = y^{1-m} = y^{-1}$), R = t, and T = 3t. Thus, by (15.24'), we can write the linearized
differential equation as

$$dz + (-tz + 3t)\,dt = 0$$

By applying formula (15.21), the solution can be found to be

$$z(t) = A\exp\left(\tfrac{1}{2}t^2\right) + 3$$

(As an exercise, trace out the steps leading to this solution.)

Since our primary interest lies in the solution y(t) rather than z(t), we must perform a
reverse transformation using the equation $z = y^{-1}$, or $y = z^{-1}$. By taking the
reciprocal of z(t), therefore, we get

$$y(t) = \frac{1}{A\exp\left(\tfrac{1}{2}t^2\right) + 3}$$

as the desired solution. This is a general solution, because an arbitrary constant A is present.

Example 4

Solve the equation $dy/dt + (1/t)y = y^3$. Here, we have m = 3 (thus $z = y^{-2}$), R = 1/t,
and T = 1; thus the equation can be linearized into the form

$$dz + \left(-\frac{2}{t}z + 2\right)dt = 0$$

As you can verify, by the use of formula (15.21), the solution of this differential equation is

$$z(t) = At^2 + 2t$$

It then follows, by the reverse transformation $y = z^{-1/2}$, that the general solution in the
original variable is to be written as

$$y(t) = (At^2 + 2t)^{-1/2}$$

As an exercise, check the validity of the solutions of these last two examples by 
differentiation. 
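
The check by differentiation can also be approximated numerically. The sketch below is an
illustration under stated assumptions: it plugs the solutions y(t) = 1/[A exp(t^2/2) + 3]
(Example 3) and y(t) = (A t^2 + 2t)^(-1/2) (Example 4) into the Bernoulli form
dy/dt + Ry = T y^m and reports the largest residual. The function names, the value A = 1, and the
sample points are hypothetical choices, not taken from the text.

```python
import math


def derivative(f, t, h=1e-6):
    """Central-difference approximation of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)


def bernoulli_residual(y, R, T, m, t):
    """dy/dt + R(t) y - T(t) y^m, which should vanish for a solution of (15.24)."""
    return derivative(y, t) + R(t) * y(t) - T(t) * y(t) ** m


A = 1.0  # arbitrary constant; any value that keeps y(t) defined would do

# Example 3: dy/dt + t y = 3 t y^2, with m = 2, R = t, T = 3t
y3 = lambda t: 1.0 / (A * math.exp(0.5 * t * t) + 3.0)

# Example 4: dy/dt + (1/t) y = y^3, with m = 3, R = 1/t, T = 1
y4 = lambda t: (A * t * t + 2.0 * t) ** -0.5

sample_t = [0.5, 1.0, 2.0]
worst_3 = max(
    abs(bernoulli_residual(y3, lambda t: t, lambda t: 3.0 * t, 2, t)) for t in sample_t
)
worst_4 = max(
    abs(bernoulli_residual(y4, lambda t: 1.0 / t, lambda t: 1.0, 3, t)) for t in sample_t
)
print(worst_3, worst_4)  # both should be close to zero
```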


EXERCISE 15.5 


1. Determine, for each of the following, (1) whether the variables are separable and (2) 
whether the equation is linear or else can be linearized: 

(a) $2t\,dy + 2y\,dt = 0$

(b) $\dfrac{y}{y + t}\,dy + \dfrac{2t}{y + t}\,dt = 0$

(c) $\dfrac{dy}{dt} = -\dfrac{t}{y}$

(d) $\dfrac{dy}{dt} = 3y^2t$

2. Solve (a) and (b) in Prob. 1 by separation of variables, taking y and t to be positive.
Check your answers by differentiation.

3. Solve (c) in Prob. 1 as a separable-variable equation and, also, as a Bernoulli equation. 

4. Solve (d) in Prob. 1 as a separable-variable equation and, also, as a Bernoulli equation. 

5. Verify the correctness of the intermediate solution $z(t) = At^2 + 2t$ in Example 4 by
showing that its derivative dz/dt is consistent with the linearized differential equation.


15.6 The Qualitative-Graphic Approach

The several cases of nonlinear differential equations previously discussed (exact differential
equations, separable-variable equations, and Bernoulli equations) have all been solved
quantitatively. That is, we have in every case sought and found a time path y(t) which, for
each value of t, tells the specific corresponding value of the variable y.

At times, we may not be able to find a quantitative solution from a given differential
equation. Yet, in such cases, it may nonetheless be possible to ascertain the qualitative
properties of the time path (primarily, whether y(t) converges) by directly observing the
differential equation itself or by analyzing its graph. Even when quantitative solutions are
available, moreover, we may still employ the techniques of qualitative analysis if the
qualitative aspect of the time path is our principal or exclusive concern.


The Phase Diagram 

Given a first-order differential equation in the general form

$$\frac{dy}{dt} = f(y)$$

either linear or nonlinear in the variable y, we can plot dy/dt against y as in Fig. 15.3. Such
a geometric representation, feasible whenever dy/dt is a function of y alone, is called a
phase diagram, and the graph representing the function f, a phase line. (A differential
equation of this form, in which the time variable t does not appear as a separate argument of
the function f, is said to be an autonomous differential equation.) Once a phase line is
known, its configuration will impart significant qualitative information regarding the time
path y(t). The clue to this lies in the following two general remarks:

1. Anywhere above the horizontal axis (where dy/dt > 0), y must be increasing over time
and, as far as the y axis is concerned, must be moving from left to right. By analogous
reasoning, any point below the horizontal axis must be associated with a leftward movement
in the variable y, because the negativity of dy/dt means that y decreases over time.
These directional tendencies explain why the arrowheads on the illustrative phase lines
in Fig. 15.3 are drawn as they are. Above the horizontal axis, the arrows are uniformly
pointed toward the right: toward the northeast or southeast or due east, as the case may
be. The opposite is true below the y axis. Moreover, these results are independent of the
algebraic sign of y; even if phase line A (or any other) is transplanted to the left of the
vertical axis, the direction of the arrows will not be affected.

2. An equilibrium level of y (in the intertemporal sense of the term), if it exists, can occur
only on the horizontal axis, where dy/dt = 0 (y stationary over time). To find an
equilibrium, therefore, it is necessary only to consider the intersection of the phase line with
the y axis.† To test the dynamic stability of equilibrium, on the other hand, we should
also check whether, regardless of the initial position of y, the phase line will always
guide it toward the equilibrium position at the said intersection.

Types of Time Path 

On the basis of the preceding general remarks, we may observe three different types of time 
path from the illustrative phase lines in Fig. 15.3. 

Phase line A has an equilibrium at point $y_a$; but above as well as below that point, the
arrowheads consistently lead away from equilibrium. Thus, although equilibrium can be
attained if it happens that $y(0) = y_a$, the more usual case of $y(0) \neq y_a$ will result in
y being ever-increasing [if $y(0) > y_a$] or ever-decreasing [if $y(0) < y_a$]. Besides, in this
case the deviation of y from $y_a$ tends to grow at an increasing pace because, as we follow the
arrowheads on the phase line, we deviate farther from the y axis, thereby encountering
ever-increasing numerical values of dy/dt as well. The time path y(t) implied by phase line A
can therefore be represented by the curves shown in Fig. 15.4a, where y is plotted against t
(rather than dy/dt against y). The equilibrium $y_a$ is dynamically unstable.

† However, not all intersections represent equilibrium positions. We shall see this when we
discuss phase line C in Fig. 15.3.

FIGURE 15.4

In contrast, phase line B implies a stable equilibrium at $y_b$. If $y(0) = y_b$, equilibrium
prevails at once. But the important feature of phase line B is that, even if
$y(0) \neq y_b$, the movement along the phase line will guide y toward the level of $y_b$. The
time path y(t) corresponding to this type of phase line should therefore be of the form shown
in Fig. 15.4b, which is reminiscent of the dynamic market model.

The preceding discussion suggests that, in general, it is the slope of the phase line at its
intersection point which holds the key to the dynamic stability of equilibrium or the
convergence of the time path. A (finite) positive slope, such as at point $y_a$, makes for
dynamic instability; whereas a (finite) negative slope, such as at $y_b$, implies dynamic
stability.

This generalization can help us to draw qualitative inferences about given differential
equations without even plotting their phase lines. Take the linear differential equation in
(15.4), for instance:

$$\frac{dy}{dt} + ay = b \qquad \text{or} \qquad \frac{dy}{dt} = -ay + b$$

Since the phase line will obviously have the (constant) slope -a, here assumed nonzero,
we may immediately infer (without drawing the line) that

$$a \gtrless 0 \iff y(t) \begin{cases} \text{converges to} \\ \text{diverges from} \end{cases} \text{equilibrium}$$


As we may expect, this result coincides perfectly with what the quantitative solution of this 
equation tells us: 


$$y(t) = \left[y(0) - \frac{b}{a}\right]e^{-at} + \frac{b}{a} \qquad [\text{from (15.5')}]$$


We have learned that, starting from a nonequilibrium position, the convergence of y(t)
hinges on the prospect that $e^{-at} \to 0$ as $t \to \infty$. This can happen if and only if
a > 0; if a < 0, then $e^{-at} \to \infty$ as $t \to \infty$, and y(t) cannot converge. Thus,
our conclusion is one
and the same, whether it is arrived at quantitatively or qualitatively. 
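
The agreement between the qualitative and quantitative routes can be illustrated with a small
simulation. The sketch below is only an illustration: `classify` reads stability off the sign of
the phase-line slope at the equilibrium, while `simulate` traces the time path by Euler steps.
Both function names, the step size, and the parameter values (a = 0.5, b = 2 and a = -0.5,
b = -2, each with equilibrium b/a = 4) are assumptions made for this example.

```python
def slope_at_equilibrium(f, y_eq, h=1e-6):
    """Numerical slope of the phase line dy/dt = f(y) at the equilibrium y_eq."""
    return (f(y_eq + h) - f(y_eq - h)) / (2 * h)


def classify(f, y_eq):
    """Negative slope implies stability; positive slope implies instability."""
    s = slope_at_equilibrium(f, y_eq)
    if s < 0:
        return "stable"
    if s > 0:
        return "unstable"
    return "inconclusive"


def simulate(f, y0, dt=0.001, steps=10000):
    """Euler approximation of y(t) at t = dt * steps, starting from y(0) = y0."""
    y = y0
    for _ in range(steps):
        y += f(y) * dt
    return y


stable_line = lambda y: -0.5 * y + 2.0    # dy/dt = -ay + b with a = 0.5, b = 2
unstable_line = lambda y: 0.5 * y - 2.0   # dy/dt = -ay + b with a = -0.5, b = -2

verdict_stable = classify(stable_line, 4.0)
verdict_unstable = classify(unstable_line, 4.0)
end_stable = simulate(stable_line, 1.0)      # starts below the equilibrium
end_unstable = simulate(unstable_line, 4.1)  # starts slightly above the equilibrium

print(verdict_stable, end_stable)      # path ends near 4
print(verdict_unstable, end_unstable)  # path has moved far away from 4
```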

It remains to discuss phase line C, which, being a closed loop sitting across the horizontal
axis, does not qualify as a function but shows instead a relation between dy/dt and
y.† The interesting new element that emerges in this case is the possibility of a periodically
fluctuating time path. The way that phase line C is drawn, we shall find y fluctuating
between the two values $y_c$ and $y_c'$ in a perpetual motion. In order to generate the periodic
fluctuation, the loop must, of course, straddle the horizontal axis in such a manner that
dy/dt can alternately be positive and negative. Besides, at the two intersection points $y_c$
and $y_c'$, the phase line should have an infinite slope; otherwise the intersection will
resemble either $y_a$ or $y_b$, neither of which permits a continual flow of arrowheads. The
type of time path y(t) corresponding to this looped phase line is illustrated in Fig. 15.4c.
Note that, whenever y(t) hits the upper bound $y_c'$ or the lower bound $y_c$, we have
dy/dt = 0 (local extrema); but these values certainly do not represent equilibrium values of y.
In terms of Fig. 15.3, this means that not all intersections between a phase line and the y axis
are equilibrium positions.

† This can arise from a second-degree differential equation $(dy/dt)^2 = f(y)$.

In sum, for the study of the dynamic stability of equilibrium (or the convergence of the
time path), one has the alternative either of finding the time path itself or else of simply
drawing the inference from its phase line. We shall illustrate the application of the latter
approach with the Solow growth model. Henceforth, we shall denote the intertemporal
equilibrium value of y by $\bar{y}$, as distinct from $y^*$.


EXERCISE 15.6 


1. Plot the phase line for each of the following, and discuss its qualitative implications: 






2. Plot the phase line for each of the following and interpret:

(a) $\dfrac{dy}{dt} = (y + 1)^2 - 16 \qquad (y \geq 0)$

(b) $\dfrac{dy}{dt} = \tfrac{1}{2}y - y^2 \qquad (y \geq 0)$

3. Given $dy/dt = (y - 3)(y - 5) = y^2 - 8y + 15$:

(a) Deduce that there are two possible equilibrium levels of y, one at y = 3 and the
other at y = 5.

(b) Find the sign of $\dfrac{d}{dy}\left(\dfrac{dy}{dt}\right)$ at y = 3 and y = 5,
respectively. What can you infer from these?


15.7 Solow Growth Model

The growth model of Professor Robert Solow,† a Nobel laureate, is purported to show,
among other things, that the razor's-edge growth path of the Domar model is primarily a
result of the particular production-function assumption adopted therein and that, under
alternative circumstances, the need for delicate balancing may not arise.

† Robert M. Solow, "A Contribution to the Theory of Economic Growth," Quarterly Journal of
Economics, February 1956, pp. 65-94.

The Framework

In the Domar model, output is explicitly stated as a function of capital alone: $\kappa = \rho K$
(the productive capacity, or potential output, is a constant multiple of the stock of capital).
The absence of a labor input in the production function carries the implication that labor is
always combined with capital in a fixed proportion, so that it is feasible to consider
explicitly only one of these factors of production. Solow, in contrast, seeks to analyze the
case where capital and labor can be combined in varying proportions. Thus his production
function appears in the form


$$Q = f(K, L) \qquad (K, L > 0)$$

where Q is output (net of depreciation), K is capital, and L is labor, all being used in the
macro sense. It is assumed that $f_K$ and $f_L$ are positive (positive marginal products), and
$f_{KK}$ and $f_{LL}$ are negative (diminishing returns to each input). Furthermore, the
production function f is taken to be linearly homogeneous (constant returns to scale).
Consequently, it is possible to write

$$Q = Lf\left(\frac{K}{L}, 1\right) = L\phi(k) \qquad \text{where } k \equiv \frac{K}{L} \qquad (15.25)$$

In view of the assumed signs of $f_K$ and $f_{KK}$, the newly introduced $\phi$ function (which,
be it noted, has only a single argument, k) must be characterized by a positive first derivative
and a negative second derivative. To verify this claim, we first recall from (12.49) that

$$f_K \equiv MPP_K = \phi'(k)$$

hence $f_K > 0$ automatically means $\phi'(k) > 0$. Then, since

$$f_{KK} = \frac{\partial}{\partial K}\phi'(k) = \frac{d\phi'(k)}{dk}\frac{\partial k}{\partial K} = \phi''(k)\frac{1}{L} \qquad [\text{see (12.48)}]$$

the assumption $f_{KK} < 0$ leads directly to the result $\phi''(k) < 0$. Thus the $\phi$
function (which, according to (12.46), gives the $APP_L$ for every capital-labor ratio) is one
that increases with k at a decreasing rate.

Given that Q depends on K and L, it is necessary now to stipulate how the latter two
variables themselves are determined. Solow's assumptions are:

$$\dot{K} \left(\equiv \frac{dK}{dt}\right) = sQ \qquad [\text{constant proportion of } Q \text{ is invested}] \qquad (15.26)$$

$$\frac{\dot{L}}{L} \left(\equiv \frac{dL/dt}{L}\right) = \lambda \quad (\lambda > 0) \qquad [\text{labor force grows exponentially}] \qquad (15.27)$$

The symbol s represents a (constant) marginal propensity to save, and $\lambda$ a (constant) rate
of growth of labor. Note the dynamic nature of these assumptions; they specify not how the
levels of K and L are determined, but how their rates of change are.

Equations (15.25) through (15.27) constitute a complete model. To solve this model, we
shall first condense it into a single equation in one variable. To begin with, substitute
(15.25) into (15.26) to get

$$\dot{K} = sL\phi(k) \qquad (15.28)$$

Since $k \equiv K/L$, and $K \equiv kL$, however, we can obtain another expression for
$\dot{K}$ by differentiating the latter identity:

$$\begin{aligned} \dot{K} &= L\dot{k} + k\dot{L} && [\text{product rule}] \\ &= L\dot{k} + k\lambda L && [\text{by (15.27)}] \end{aligned} \qquad (15.29)$$

When (15.29) is equated to (15.28) and the common factor L eliminated, the result emerges
that

$$\dot{k} = s\phi(k) - \lambda k \qquad (15.30)$$

This equation, a differential equation in the variable k with two parameters s and $\lambda$, is
the fundamental equation of the Solow growth model.

A Qualitative-Graphic Analysis 

Because (15.30) is stated in a general-function form, no specific quantitative solution is
available. Nevertheless, we can analyze it qualitatively. To this end, we should plot a phase
line, with $\dot{k}$ on the vertical axis and k on the horizontal.

Since (15.30) contains two terms on the right, however, let us first plot these as two
separate curves. The $\lambda k$ term, a linear function of k, will obviously show up in
Fig. 15.5a as a straight line, with a zero vertical intercept and a slope equal to $\lambda$. The
$s\phi(k)$ term, on the other hand, plots as a curve that increases at a decreasing rate, like
$\phi(k)$, since $s\phi(k)$ is merely a constant fraction of the $\phi(k)$ curve. If we consider
K to be an indispensable factor of production, we must start the $s\phi(k)$ curve from the point
of origin; this is because if K = 0 and thus k = 0, Q must also be zero, as will be $\phi(k)$ and
$s\phi(k)$. The way the curve is actually drawn also reflects the implicit assumption that there
exists a set of k values for which $s\phi(k)$ exceeds $\lambda k$, so that the two curves
intersect at some positive value of k, namely $\bar{k}$.

Based upon these two curves, the value of $\dot{k}$ for each value of k can be measured by the
vertical distance between the two curves. Plotting the values of $\dot{k}$ against k, as in
Fig. 15.5b, will then yield the phase line we need. Note that, since the two curves in
Fig. 15.5a intersect when the capital-labor ratio is $\bar{k}$, the phase line in Fig. 15.5b
must cross the horizontal axis at $\bar{k}$. This marks $\bar{k}$ as the intertemporal
equilibrium capital-labor ratio.

FIGURE 15.5

Inasmuch as the phase line has a negative slope at $\bar{k}$, the equilibrium is readily
identified as a stable one; given any (positive) initial value of k, the dynamic movement of the
model must lead us convergently to the equilibrium level $\bar{k}$. The significant point is that
once this equilibrium is attained, and thus the capital-labor ratio is (by definition) unvarying
over time, capital must thereafter grow apace with labor, at the identical rate $\lambda$. This
will imply, in turn, that net investment must grow at the rate $\lambda$ (see Exercise 15.7-2).
Note, however, that the word must is used here not in the sense of requirement, but with the
implication of automaticity. Thus, what the Solow model serves to show is that, given a rate of
growth of labor $\lambda$, the economy by itself, and without the delicate balancing à la Domar,
can eventually reach a state of steady growth in which investment will grow at the rate
$\lambda$, the same as K and L. Moreover, in order to satisfy (15.25), Q must grow at the same
rate as well because $\phi(k)$ is a constant when the capital-labor ratio remains unvarying at
the level $\bar{k}$. Such a situation, in which the relevant variables all grow at an identical
rate, is called a steady state, a generalization of the concept of stationary state (in which the
relevant variables all remain constant, or in other words all grow at the zero rate).

Note that, in the preceding analysis, the production function is assumed for convenience
to be invariant over time. If the state of technology is allowed to improve, on the other hand,
the production function will have to be duly modified. For instance, it may be written
instead in the form

$$Q = T(t)f(K, L) \qquad \left(\frac{dT}{dt} > 0\right)$$

where T, some measure of technology, is an increasing function of time. Because of the
increasing multiplicative term T(t), a fixed amount of K and L will turn out a larger output at
a future date than at present. In this event, the $s\phi(k)$ curve in Fig. 15.5 will be subject
to a secular upward shift, resulting in successively higher intersections with the $\lambda k$
ray and also in larger values of $\bar{k}$. With technological improvement, therefore, it will
become possible, in a succession of steady states, to have a larger and larger amount of capital
equipment available to each representative worker in the economy, with a concomitant rise
in productivity.

A Quantitative Illustration 

The preceding analysis had to be qualitative, owing to the presence of a general function
$\phi(k)$ in the model. But if we specify the production function to be a linearly homogeneous
Cobb-Douglas function, for instance, then a quantitative solution can be found as well.

Let us write the production function as

$$Q = K^{\alpha}L^{1-\alpha} = L\left(\frac{K}{L}\right)^{\alpha} = Lk^{\alpha}$$

so that $\phi(k) = k^{\alpha}$. Then (15.30) becomes

$$\dot{k} = sk^{\alpha} - \lambda k \qquad \text{or} \qquad \dot{k} + \lambda k = sk^{\alpha}$$

which is a Bernoulli equation in the variable k [see (15.24)], with $R = \lambda$, $T = s$, and
$m = \alpha$. Letting $z = k^{1-\alpha}$, we obtain its linearized version

$$dz + [(1 - \alpha)\lambda z - (1 - \alpha)s]\,dt = 0$$

or

$$\frac{dz}{dt} + \underbrace{(1 - \alpha)\lambda}_{a}\,z = \underbrace{(1 - \alpha)s}_{b}$$

This is a linear differential equation with a constant coefficient a and a constant term b.
Thus, by formula (15.5'), we have

$$z(t) = \left[z(0) - \frac{s}{\lambda}\right]e^{-(1-\alpha)\lambda t} + \frac{s}{\lambda}$$

The substitution of $z = k^{1-\alpha}$ will then yield the final solution

$$k^{1-\alpha} = \left[k(0)^{1-\alpha} - \frac{s}{\lambda}\right]e^{-(1-\alpha)\lambda t} + \frac{s}{\lambda}$$

where k(0) is the initial value of the capital-labor ratio k.

This solution is what determines the time path of k. Recalling that $(1 - \alpha)$ and $\lambda$
are both positive, we see that as $t \to \infty$ the exponential expression will approach zero;
consequently,

$$k^{1-\alpha} \to \frac{s}{\lambda} \qquad \text{or} \qquad k \to \left(\frac{s}{\lambda}\right)^{1/(1-\alpha)} \qquad \text{as } t \to \infty$$

Therefore, the capital-labor ratio will approach a constant as its equilibrium value. This
equilibrium or steady-state value, $(s/\lambda)^{1/(1-\alpha)}$, varies directly with the
propensity to save s and inversely with the rate of growth of labor $\lambda$.
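
The closed-form time path can be compared with a direct simulation of the Cobb-Douglas version
of (15.30), dk/dt = s k^alpha - lambda k. The sketch below is an illustration only: the
parameter values (s = 0.2, lambda = 0.05, alpha = 0.5, k(0) = 1), the Euler step size, and the
function names are hypothetical choices, not taken from the text. With these values the
steady-state ratio (s/lambda)^(1/(1 - alpha)) is approximately 16.

```python
import math


def k_closed_form(t, k0, s, lam, alpha):
    """Time path of k from the Bernoulli solution, using z = k^(1 - alpha)."""
    z = (k0 ** (1 - alpha) - s / lam) * math.exp(-(1 - alpha) * lam * t) + s / lam
    return z ** (1 / (1 - alpha))


def k_euler(t_end, k0, s, lam, alpha, dt=0.001):
    """Euler approximation of dk/dt = s k^alpha - lam k up to time t_end."""
    k = k0
    for _ in range(int(round(t_end / dt))):
        k += (s * k ** alpha - lam * k) * dt
    return k


# Hypothetical parameter values chosen for illustration.
s, lam, alpha, k0 = 0.2, 0.05, 0.5, 1.0

k_bar = (s / lam) ** (1 / (1 - alpha))         # steady-state capital-labor ratio
k_exact_50 = k_closed_form(50.0, k0, s, lam, alpha)
k_sim_50 = k_euler(50.0, k0, s, lam, alpha)
k_exact_400 = k_closed_form(400.0, k0, s, lam, alpha)

print(k_bar, k_exact_50, k_sim_50, k_exact_400)
```

Raising s or lowering lam in this sketch moves `k_bar` up, in line with the comparative-static
remark in the text.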


EXERCISE 15.7 

1. Divide (15.30) through by k, and interpret the resulting equation in terms of the
growth rates of k, K, and L.

2. Show that, if capital is growing at the rate $\lambda$ (that is, $K = Ae^{\lambda t}$), net
investment I must also be growing at the rate $\lambda$.

3. The original input variables of the Solow model are K and L, but the fundamental
equation (15.30) focuses on the capital-labor ratio k instead. What assumption(s) in the
model is(are) responsible for (and make possible) this shift of focus? Explain.

4. Draw a phase diagram for each of the following, and discuss the qualitative aspects of
the time path y(t):

(a) $\dot{y} = 3 - y - \ln y$      (b) $\dot{y} = e^y - (y + 2)$



Chapter 16


Higher-Order Differential Equations


In Chap. 15, we discussed the methods of solving a first-order differential equation, one in
which there appears no derivative (or differential) of orders higher than 1. At times,
however, the specification of a model may involve the second derivative or a derivative of an
even higher order. We may, for instance, be given a function describing "the rate of change
of the rate of change" of the income variable Y, say,

$$\frac{d^2Y}{dt^2} = kY$$

from which we are supposed to find the time path of Y. In this event, the given function
constitutes a second-order differential equation, and the task of finding the time path Y(t) is
that of solving the second-order differential equation. The present chapter is concerned
with the methods of solution and the economic applications of such higher-order differential
equations, but we shall confine our discussion to the linear case only.

A simple variety of linear differential equations of order n is of the following form:

$$\frac{d^ny}{dt^n} + a_1\frac{d^{n-1}y}{dt^{n-1}} + \cdots + a_{n-1}\frac{dy}{dt} + a_ny = b \qquad (16.1)$$

or, in an alternative notation,

$$y^{(n)}(t) + a_1y^{(n-1)}(t) + \cdots + a_{n-1}y'(t) + a_ny = b \qquad (16.1')$$

This equation is of order n, because the nth derivative (the first term on the left) is the
highest derivative present. It is linear, since all the derivatives, as well as the dependent
variable y, appear only in the first degree, and moreover, no product term occurs in which y and
any of its derivatives are multiplied together. You will note, in addition, that this
differential equation is characterized by constant coefficients (the a's) and a constant term
(b). The constancy of the coefficients is an assumption we shall retain throughout this chapter.
The constant term b, on the other hand, is adopted here as a first approach; later, in Sec. 16.5,
we shall drop it in favor of a variable term.


16.1 Second-Order Linear Differential Equations with Constant Coefficients and Constant Term

For pedagogic reasons, let us first discuss the method of solution for the second-order case
(n = 2). The relevant differential equation is then the simple one

$$y''(t) + a_1y'(t) + a_2y = b \qquad (16.2)$$

where $a_1$, $a_2$, and b are all constants. If the term b is identically zero, we have a
homogeneous equation, but if b is a nonzero constant, the equation is nonhomogeneous. Our
discussion will proceed on the assumption that (16.2) is nonhomogeneous; in solving the
nonhomogeneous version of (16.2), the solution of the homogeneous version will emerge
automatically as a by-product.

In this connection, we recall a proposition introduced in Sec. 15.1 which is equally
applicable here: If $y_c$ is the complementary function, i.e., the general solution (containing
arbitrary constants) of the reduced equation of (16.2), and if $y_p$ is the particular integral,
i.e., any particular solution (containing no arbitrary constants) of the complete equation
(16.2), then $y(t) = y_c + y_p$ will be the general solution of the complete equation. As was
explained previously, the $y_p$ component provides us with the equilibrium value of the variable
y in the intertemporal sense of the term, whereas the $y_c$ component reveals, for each point of
time, the deviation of the time path y(t) from the equilibrium.

The Particular Integral 

For the case of constant coefficients and constant term, the particular integral is relatively
easy to find. Since the particular integral can be any solution of (16.2), i.e., any value of y
that satisfies this nonhomogeneous equation, we should always try the simplest possible
type: namely, y = a constant. If y = a constant, it follows that

$$y'(t) = y''(t) = 0$$

so that (16.2) in effect becomes $a_2y = b$, with the solution $y = b/a_2$. Thus, the desired
particular integral is

$$y_p = \frac{b}{a_2} \qquad (\text{case of } a_2 \neq 0) \qquad (16.3)$$

Since the process of finding the value of $y_p$ involves the condition $y'(t) = y''(t) = 0$, the
rationale for considering that value as an intertemporal equilibrium becomes self-evident.

Example 1

Find the particular integral of the equation

$$y''(t) + y'(t) - 2y = -10$$

The relevant coefficients here are $a_2 = -2$ and $b = -10$. Therefore, the particular integral
is $y_p = -10/(-2) = 5$.

What if $a_2 = 0$, so that the expression $b/a_2$ is not defined? In such a situation, since the
constant solution for $y_p$ fails to work, we must try some nonconstant form of solution. Taking
the simplest possibility, we may try y = kt. Since $a_2 = 0$, the differential equation is now

$$y''(t) + a_1y'(t) = b$$

but if y = kt, which implies y'(t) = k and y''(t) = 0, this equation reduces to $a_1k = b$.
This determines the value of k as $b/a_1$, thereby giving us the particular integral

$$y_p = \frac{b}{a_1}t \qquad (\text{case of } a_2 = 0;\ a_1 \neq 0) \qquad (16.3')$$

Inasmuch as $y_p$ is in this case a nonconstant function of time, we shall regard it as a
moving equilibrium.

Example 2

Find the $y_p$ of the equation $y''(t) + y'(t) = -10$. Here, we have $a_2 = 0$, $a_1 = 1$, and
$b = -10$. Thus, by (16.3'), we can write

$$y_p = -10t$$

If it happens that $a_1$ is also zero, then the solution form of $y = kt$ will also break down,
because the expression $bt/a_1$ will now be undefined. We ought, then, to try a solution of the
form $y = kt^2$. With $a_1 = a_2 = 0$, the differential equation now reduces to the extremely
simple form

$$y''(t) = b$$

and if $y = kt^2$, which implies $y'(t) = 2kt$ and $y''(t) = 2k$, the differential equation can be
written as $2k = b$. Thus, we find $k = b/2$, and the particular integral is

$$y_p = \frac{b}{2}\, t^2 \qquad (\text{case of } a_1 = a_2 = 0) \qquad (16.3'')$$

The equilibrium represented by this particular integral is again a moving equilibrium.


Example 3

Find the $y_p$ of the equation $y''(t) = -10$. Since the coefficients are $a_1 = a_2 = 0$ and
$b = -10$, formula (16.3'') is applicable. The desired answer is $y_p = -5t^2$.
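The three formulas (16.3), (16.3'), and (16.3'') can be collected into one small routine. The following is only a sketch in Python: the function name `particular_integral` and its convention of returning a pair $(k, n)$, meaning $y_p = kt^n$, are our own illustrative choices and not part of the text's notation.

```python
def particular_integral(a1, a2, b):
    """Return (k, n) so that y_p(t) = k * t**n solves y'' + a1*y' + a2*y = b.

    The function name and the (k, n) return convention are illustrative only.
    """
    if a2 != 0:
        return (b / a2, 0)   # (16.3): constant (stationary) equilibrium
    if a1 != 0:
        return (b / a1, 1)   # (16.3'): moving equilibrium (b/a1) * t
    return (b / 2, 2)        # (16.3''): moving equilibrium (b/2) * t**2


# Examples 1, 2, and 3 from the text
for a1, a2, b in [(1, -2, -10), (1, 0, -10), (0, 0, -10)]:
    k, n = particular_integral(a1, a2, b)
    print(f"a1={a1}, a2={a2}, b={b}: y_p = {k} * t**{n}")
```

The order of the tests mirrors the order of the discussion: the constant solution is tried first, and a higher power of $t$ is used only when the simpler form breaks down.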


The Complementary Function 

The complementary function of (16.2) is defined to be the general solution of its reduced
(homogeneous) equation

$$y''(t) + a_1 y'(t) + a_2 y = 0 \qquad (16.4)$$

This is why we stated that the solution of a homogeneous equation will always be a
by-product in the process of solving a complete equation.

Even though we have never tackled such an equation before, our experience with the
complementary function of the first-order differential equations can supply us with a useful
hint. From the solutions (15.3), (15.3'), (15.5), and (15.5'), it is clear that exponential
expressions of the form $Ae^{rt}$ figure very prominently in the complementary functions of
first-order differential equations with constant coefficients. Then why not try a solution of
the form $y = Ae^{rt}$ in the second-order equation, too?

If we adopt the trial solution $y = Ae^{rt}$, we must also accept

$$y'(t) = rAe^{rt} \qquad \text{and} \qquad y''(t) = r^2 Ae^{rt}$$

as the derivatives of $y$. On the basis of these expressions for $y$, $y'(t)$, and $y''(t)$, the
reduced differential equation (16.4) can be transformed into

$$Ae^{rt}(r^2 + a_1 r + a_2) = 0 \qquad (16.4')$$

As long as we choose those values of $A$ and $r$ that satisfy (16.4'), the trial solution
$y = Ae^{rt}$ should work. Since $e^{rt}$ can never be zero, we must either let $A = 0$ or see to
it that $r$ satisfies the equation

$$r^2 + a_1 r + a_2 = 0 \qquad (16.4'')$$

Since the value of the (arbitrary) constant $A$ is to be definitized by use of the initial
conditions of the problem, however, we cannot simply set $A = 0$ at will. Therefore, it is
essential to look for values of $r$ that satisfy (16.4'').

Equation (16.4'') is known as the characteristic equation (or auxiliary equation) of the
homogeneous equation (16.4), or of the complete equation (16.2). Because it is a quadratic
equation in $r$, it yields two roots (solutions), referred to in the present context as
characteristic roots, as follows:†

$$r_1, r_2 = \frac{-a_1 \pm \sqrt{a_1^2 - 4a_2}}{2} \qquad (16.5)$$


These two roots bear a simple but interesting relationship to each other, which can serve as
a convenient means of checking our calculation: The sum of the two roots is always equal to
$-a_1$, and their product is always equal to $a_2$. The proof of this statement is straightforward:

$$r_1 + r_2 = \frac{-a_1 + \sqrt{a_1^2 - 4a_2}}{2} + \frac{-a_1 - \sqrt{a_1^2 - 4a_2}}{2} = \frac{-2a_1}{2} = -a_1$$

$$r_1 r_2 = \frac{(-a_1)^2 - (a_1^2 - 4a_2)}{4} = \frac{4a_2}{4} = a_2 \qquad (16.6)$$
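The check described in (16.6) is easy to carry out numerically. The sketch below is written in Python, with a function name (`characteristic_roots`) of our own choosing; it applies formula (16.5) and then compares the sum and product of the roots with $-a_1$ and $a_2$. It relies on the standard `cmath` module so that a negative value under the square root does not cause an error.

```python
import cmath


def characteristic_roots(a1, a2):
    """Roots of r**2 + a1*r + a2 = 0 by formula (16.5)."""
    root = cmath.sqrt(a1**2 - 4 * a2)   # cmath also handles a negative discriminant
    return (-a1 + root) / 2, (-a1 - root) / 2


a1, a2 = 1, -2   # coefficients of y'' + y' - 2y = -10
r1, r2 = characteristic_roots(a1, a2)
print(r1, r2)
print(abs((r1 + r2) - (-a1)) < 1e-12)   # sum of roots equals -a1
print(abs(r1 * r2 - a2) < 1e-12)        # product of roots equals a2
```

Comparing within a small tolerance, rather than testing exact equality, allows for rounding in floating-point arithmetic.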

The values of these two roots are the only values we may assign to $r$ in the solution
$y = Ae^{rt}$. But this means that, in effect, there are two solutions which will work, namely,

$$y_1 = A_1 e^{r_1 t} \qquad \text{and} \qquad y_2 = A_2 e^{r_2 t}$$

where $A_1$ and $A_2$ are two arbitrary constants, and $r_1$ and $r_2$ are the characteristic roots
found from (16.5). Since we want only one general solution, however, there seems to be
one too many. Two alternatives are now open to us: (1) pick either $y_1$ or $y_2$ at random, or
(2) combine them in some fashion.

The first alternative, though simpler, is unacceptable. There is only one arbitrary constant
in $y_1$ or $y_2$, but to qualify as a general solution of a second-order differential equation,
the expression must contain two arbitrary constants. This requirement stems from the fact
that, in proceeding from a function $y(t)$ to its second derivative $y''(t)$, we "lose" two
constants during the two rounds of differentiation; therefore, to revert from a second-order
differential equation to the primitive function $y(t)$, two constants should be reinstated.
That leaves us only the alternative of combining $y_1$ and $y_2$, so as to include both constants
$A_1$ and $A_2$. As it turns out, we can simply take their sum, $y_1 + y_2$, as the general solution
of (16.4). Let us demonstrate that, if $y_1$ and $y_2$, respectively, satisfy (16.4), then the sum
$(y_1 + y_2)$ will also do so. If $y_1$ and $y_2$ are indeed solutions of (16.4), then by substituting
each of these into (16.4), we must find that the following two equations hold:

$$y_1''(t) + a_1 y_1'(t) + a_2 y_1 = 0$$

$$y_2''(t) + a_1 y_2'(t) + a_2 y_2 = 0$$

By adding these equations, however, we find that

$$\underbrace{[y_1''(t) + y_2''(t)]}_{=\, \frac{d^2}{dt^2}(y_1 + y_2)} + a_1 \underbrace{[y_1'(t) + y_2'(t)]}_{=\, \frac{d}{dt}(y_1 + y_2)} + a_2 (y_1 + y_2) = 0$$

Thus, like $y_1$ or $y_2$, the sum $(y_1 + y_2)$ satisfies the equation (16.4) as well. Accordingly,
the general solution of the homogeneous equation (16.4) or the complementary function of the
complete equation (16.2) can, in general, be written as $y_c = y_1 + y_2$.

† Note that the quadratic equation (16.4'') is in the normalized form; the coefficient of the $r^2$
term is 1. In applying formula (16.5) to find the characteristic roots of a differential equation,
we must first make sure that the characteristic equation is indeed in the normalized form.

A more careful examination of the characteristic-root formula (16.5) indicates, however,
that as far as the values of $r_1$ and $r_2$ are concerned, three possible cases can arise, some of
which may necessitate a modification of our result $y_c = y_1 + y_2$.

Case 1 (distinct real roots) When $a_1^2 > 4a_2$, the square root in (16.5) is a real number,
and the two roots $r_1$ and $r_2$ will take distinct real values, because the square root is added to
$-a_1$ for $r_1$, but subtracted from $-a_1$ for $r_2$. In this case, we can indeed write

$$y_c = y_1 + y_2 = A_1 e^{r_1 t} + A_2 e^{r_2 t} \qquad (r_1 \neq r_2) \qquad (16.7)$$

Because the two roots are distinct, the two exponential expressions must be linearly
independent (neither is a multiple of the other); consequently, $A_1$ and $A_2$ will always remain
as separate entities and provide us with two constants, as required.


Example 4

Solve the differential equation

$$y''(t) + y'(t) - 2y = -10$$

The particular integral of this equation has already been found to be $y_p = 5$, in Example 1.
Let us find the complementary function. Since the coefficients of the equation are $a_1 = 1$
and $a_2 = -2$, the characteristic roots are, by (16.5),

$$r_1, r_2 = \frac{-1 \pm \sqrt{1 + 8}}{2} = \frac{-1 \pm 3}{2} = 1, -2$$

(Check: $r_1 + r_2 = -1 = -a_1$; $r_1 r_2 = -2 = a_2$.) Since the roots are distinct real numbers,
the complementary function is $y_c = A_1 e^{t} + A_2 e^{-2t}$. Therefore, the general solution can be
written as

$$y(t) = y_c + y_p = A_1 e^{t} + A_2 e^{-2t} + 5 \qquad (16.8)$$

In order to definitize the constants $A_1$ and $A_2$, there is need now for two initial
conditions. Let these conditions be $y(0) = 12$ and $y'(0) = -2$. That is, when $t = 0$, $y(t)$ and
$y'(t)$ are, respectively, 12 and $-2$. Setting $t = 0$ in (16.8), we find that

$$y(0) = A_1 + A_2 + 5$$

Differentiating (16.8) with respect to $t$ and then setting $t = 0$ in the derivative, we find that

$$y'(t) = A_1 e^{t} - 2A_2 e^{-2t} \qquad \text{and} \qquad y'(0) = A_1 - 2A_2$$

To satisfy the two initial conditions, therefore, we must set $y(0) = 12$ and $y'(0) = -2$,
which results in the following pair of simultaneous equations:

$$A_1 + A_2 = 7$$
$$A_1 - 2A_2 = -2$$

with solutions $A_1 = 4$ and $A_2 = 3$. Thus the definite solution of the differential equation is

$$y(t) = 4e^{t} + 3e^{-2t} + 5 \qquad (16.8')$$

As before, we can check the validity of this solution by differentiation. The first and
second derivatives of (16.8') are

$$y'(t) = 4e^{t} - 6e^{-2t} \qquad \text{and} \qquad y''(t) = 4e^{t} + 12e^{-2t}$$

When these are substituted into the given differential equation along with (16.8'), the result
is an identity $-10 = -10$. Thus the solution is correct. As you can easily verify, (16.8') also
satisfies both of the initial conditions.
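The steps of Example 4 (find the roots, find $y_p$, then solve two simultaneous equations for $A_1$ and $A_2$) can be retraced in a few lines of Python. This is a sketch under stated assumptions: the helper names `solve_distinct_real` and `residual` are hypothetical, and the routine covers only Case 1 with $a_2 \neq 0$.

```python
import math


def solve_distinct_real(a1, a2, b, y0, dy0):
    """Definite solution of y'' + a1*y' + a2*y = b when a1**2 > 4*a2 and a2 != 0.

    Returns (A1, A2, r1, r2, yp) for y(t) = A1*exp(r1*t) + A2*exp(r2*t) + yp.
    """
    if a2 == 0 or a1**2 <= 4 * a2:
        raise ValueError("this sketch covers Case 1 with a2 != 0 only")
    root = math.sqrt(a1**2 - 4 * a2)
    r1, r2 = (-a1 + root) / 2, (-a1 - root) / 2
    yp = b / a2
    # Initial conditions: A1 + A2 = y0 - yp and r1*A1 + r2*A2 = dy0
    gap = y0 - yp
    A2 = (dy0 - r1 * gap) / (r2 - r1)
    A1 = gap - A2
    return A1, A2, r1, r2, yp


A1, A2, r1, r2, yp = solve_distinct_real(1, -2, -10, y0=12, dy0=-2)


def residual(t):
    """Left side minus right side of y'' + y' - 2y = -10 at time t."""
    e1, e2 = math.exp(r1 * t), math.exp(r2 * t)
    y = A1 * e1 + A2 * e2 + yp
    dy = r1 * A1 * e1 + r2 * A2 * e2
    d2y = r1**2 * A1 * e1 + r2**2 * A2 * e2
    return d2y + 1 * dy - 2 * y - (-10)


print(A1, A2, r1, r2, yp)
print(residual(0.0), residual(1.5))   # both should be zero up to rounding
```

A residual of (approximately) zero at any chosen $t$ is the numerical counterpart of the identity $-10 = -10$ obtained above.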

Case 2 (repeated real roots) When the coefficients in the differential equation are such
that $a_1^2 = 4a_2$, the square root in (16.5) will vanish, and the two characteristic roots take an
identical value:

$$r\ (= r_1 = r_2) = -\frac{a_1}{2}$$

Such roots are known as repeated roots, or multiple (here, double) roots.

If we attempt to write the complementary function as $y_c = y_1 + y_2$, the sum will in this
case collapse into a single expression

$$y_c = A_1 e^{rt} + A_2 e^{rt} = (A_1 + A_2) e^{rt} = A_3 e^{rt}$$

leaving us with only one constant. This is not sufficient to lead us from a second-order
differential equation back to its primitive function. The only way out is to find another
eligible component term for the sum: a term which satisfies (16.4) and yet which is linearly
independent of the term $A_3 e^{rt}$, so as to preclude such "collapsing."

An expression that will satisfy these requirements is $A_4 t e^{rt}$. Since the variable $t$ has
entered into it multiplicatively, this component term is obviously linearly independent of
the $A_3 e^{rt}$ term; thus it will enable us to introduce another constant, $A_4$. But does
$A_4 t e^{rt}$ qualify as a solution of (16.4)? If we try $y = A_4 t e^{rt}$, then, by the product rule,
we can find its first and second derivatives to be

$$y'(t) = (rt + 1) A_4 e^{rt} \qquad \text{and} \qquad y''(t) = (r^2 t + 2r) A_4 e^{rt}$$

Substituting these expressions of $y$, $y'$, and $y''$ into the left side of (16.4), we get the
expression

$$[(r^2 t + 2r) + a_1 (rt + 1) + a_2 t]\, A_4 e^{rt}$$

Inasmuch as, in the present context, we have $a_1^2 = 4a_2$ and $r = -a_1/2$, this last expression
vanishes identically (the terms in $t$ collect into $r^2 + a_1 r + a_2 = 0$, and the remaining terms
into $2r + a_1 = 0$) and thus is always equal to the right side of (16.4); this shows that
$A_4 t e^{rt}$ does indeed qualify as a solution.

Hence, the complementary function of the double-root case can be written as

$$y_c = A_3 e^{rt} + A_4 t e^{rt} \qquad (16.9)$$

Example 5

Solve the differential equation

$$y''(t) + 6y'(t) + 9y = 27$$

Here, the coefficients are $a_1 = 6$ and $a_2 = 9$; since $a_1^2 = 4a_2$, the roots will be repeated.
According to formula (16.5), we have $r = -a_1/2 = -3$. Thus, in line with the result in
(16.9), the complementary function may be written as

$$y_c = A_3 e^{-3t} + A_4 t e^{-3t}$$

The general solution of the given differential equation is now also readily obtainable.
Trying a constant solution for the particular integral, we get $y_p = 3$. It follows that the
general solution of the complete equation is

$$y(t) = y_c + y_p = A_3 e^{-3t} + A_4 t e^{-3t} + 3$$

The two arbitrary constants can again be definitized with two initial conditions. Suppose
that the initial conditions are $y(0) = 5$ and $y'(0) = -5$. By setting $t = 0$ in the preceding
general solution, we should find $y(0) = 5$; that is,

$$y(0) = A_3 + 3 = 5$$

This yields $A_3 = 2$. Next, by differentiating the general solution and then setting $t = 0$ and
also $A_3 = 2$, we must have $y'(0) = -5$. That is,

$$y'(t) = -3A_3 e^{-3t} - 3A_4 t e^{-3t} + A_4 e^{-3t}$$
$$\text{and} \qquad y'(0) = -6 + A_4 = -5$$

This yields $A_4 = 1$. Thus we can finally write the definite solution of the given equation as

$$y(t) = 2e^{-3t} + t e^{-3t} + 3$$
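The repeated-root procedure of Example 5 can be sketched in Python as well. The function name `solve_repeated_root` is our own, and the routine assumes $a_1^2 = 4a_2$ exactly (which is safe for integer coefficients such as those of the example) together with $a_2 \neq 0$.

```python
import math


def solve_repeated_root(a1, a2, b, y0, dy0):
    """Definite solution of y'' + a1*y' + a2*y = b when a1**2 == 4*a2 and a2 != 0.

    Returns (A3, A4, r, yp) for y(t) = A3*exp(r*t) + A4*t*exp(r*t) + yp.
    """
    if a2 == 0 or a1**2 != 4 * a2:
        raise ValueError("this sketch covers Case 2 with a2 != 0 only")
    r = -a1 / 2
    yp = b / a2
    A3 = y0 - yp          # from y(0) = A3 + yp
    A4 = dy0 - r * A3     # from y'(0) = r*A3 + A4
    return A3, A4, r, yp


A3, A4, r, yp = solve_repeated_root(6, 9, 27, y0=5, dy0=-5)


def y(t):
    """Time path y(t) of the definite solution."""
    return (A3 + A4 * t) * math.exp(r * t) + yp


print(A3, A4, r, yp)
print(y(0.0))
```

Because $y(0)$ involves only $A_3$, the two constants can be found one after the other here, without solving a simultaneous system.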

Case 3 (complex roots) There remains a third possibility regarding the relative magnitude
of the coefficients $a_1$ and $a_2$, namely, $a_1^2 < 4a_2$. When this eventuality occurs, formula
(16.5) will involve the square root of a negative number, which cannot be handled before
we are properly introduced to the concepts of imaginary and complex numbers. For the
time being, therefore, we shall be content with the mere cataloging of this case and shall
leave the full discussion of it to Secs. 16.2 and 16.3.

The three cases cited can be illustrated by the three curves in Fig. 16.1, each of which
represents a different version of the quadratic function $f(r) = r^2 + a_1 r + a_2$. As we
learned earlier, when such a function is set equal to zero, the result is a quadratic equation
$f(r) = 0$, and to solve the latter equation is merely to "find the zeros of the quadratic
function." Graphically, this means that the roots of the equation are to be found on the
horizontal axis, where $f(r) = 0$.

The position of the lowest curve in Fig. 16.1 is such that the curve intersects the horizontal
axis twice; thus we can find two distinct roots $r_1$ and $r_2$, both of which satisfy the
quadratic equation $f(r) = 0$ and both of which, of course, are real-valued. Thus the lowest
curve illustrates Case 1. Turning to the middle curve, we note that it meets the horizontal
axis only once, at $r_3$. This latter is the only value of $r$ that can satisfy the equation $f(r) = 0$.
Therefore, the middle curve illustrates Case 2. Last, we note that the top curve does not
meet the horizontal axis at all, and there is thus no real-valued root to the equation
$f(r) = 0$. While there exist no real roots in such a case, there are nevertheless two complex
numbers that can satisfy the equation, as will be shown in Sec. 16.2.

The Dynamic Stability of Equilibrium 

For Cases 1 and 2, the condition for dynamic stability of equilibrium again depends on the
algebraic signs of the characteristic roots.

For Case 1, the complementary function (16.7) consists of the two exponential expressions
$A_1 e^{r_1 t}$ and $A_2 e^{r_2 t}$. The coefficients $A_1$ and $A_2$ are arbitrary constants; their values
hinge on the initial conditions of the problem. Thus we can be sure of a dynamically stable
equilibrium ($y_c \to 0$ as $t \to \infty$), regardless of what the initial conditions happen to be, if
and only if the roots $r_1$ and $r_2$ are both negative. We emphasize the word both here, because
the condition for dynamic stability does not permit even one of the roots to be positive or
zero. If $r_1 = 2$ and $r_2 = -5$, for instance, it might appear at first glance that the second
root, being larger in absolute value, can outweigh the first. In actuality, however, it is the
positive root that must eventually dominate, because as $t$ increases, $e^{2t}$ will grow
increasingly larger, but $e^{-5t}$ will steadily dwindle away.

For Case 2, with repeated roots, the complementary function (16.9) contains not only
the familiar $e^{rt}$ expression, but also a multiplicative expression $te^{rt}$. For the former term to
approach zero whatever the initial conditions may be, it is necessary-and-sufficient to have
$r < 0$. But would that also ensure the vanishing of $te^{rt}$? As it turns out, the expression $te^{rt}$
(or, more generally, $t^k e^{rt}$) possesses the same general type of time path as does $e^{rt}$ ($r \neq 0$).
Thus the condition $r < 0$ is indeed necessary-and-sufficient for the entire complementary
function to approach zero as $t \to \infty$, yielding a dynamically stable intertemporal
equilibrium.
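The stability condition for Cases 1 and 2 amounts to a sign test on the two real roots. The Python sketch below uses a hypothetical helper, `is_dynamically_stable`, and deliberately refuses the complex-root case, which has not yet been discussed. It also tabulates $te^{rt}$ for $r = -3$ at a few arbitrarily chosen values of $t$ to illustrate how the multiplicative term dwindles.

```python
import math


def is_dynamically_stable(a1, a2):
    """True if both real characteristic roots of r**2 + a1*r + a2 = 0 are negative."""
    disc = a1**2 - 4 * a2
    if disc < 0:
        raise ValueError("complex roots (Case 3) are outside this sketch")
    root = math.sqrt(disc)
    r1, r2 = (-a1 + root) / 2, (-a1 - root) / 2
    return r1 < 0 and r2 < 0


print(is_dynamically_stable(3, -10))  # roots 2 and -5: the positive root dominates
print(is_dynamically_stable(6, 9))    # repeated root -3

# The multiplicative term t*exp(r*t) with r = -3 at increasing values of t
for t in (1, 5, 10, 20):
    print(t, t * math.exp(-3 * t))
```

A zero root also fails the test, in line with the remark that not even one root may be positive or zero.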
EXERCISE 16.1

1. Find the particular integral of each equation:

(a) $y''(t) - 2y'(t) + 5y = 2$
(b) $y''(t) + y'(t) = 7$
(c) $y''(t) + 3y = 9$
(d) $y''(t) + 2y'(t) - y = -4$
(e) $y''(t) = 12$

2. Find the complementary function of each equation:

(a) $y''(t) + 3y'(t) - 4y = 12$
(b) $y''(t) + 6y'(t) + 5y = 10$
(c) $y''(t) - 2y'(t) + y = 3$
(d) $y''(t) + 8y'(t) + 16y = 0$

3. Find the general solution of each differential equation in Prob. 2, and then definitize
the solution with the initial conditions $y(0) = 4$ and $y'(0) = 2$.

4. Are the intertemporal equilibriums found in Prob. 3 dynamically stable?

5. Verify that the definite solution in Example 5 indeed (a) satisfies the two initial
conditions and (b) has first and second derivatives that conform to the given differential
equation.

6. Show that, as $t \to \infty$, the limit of $te^{rt}$ is zero if $r < 0$, but is infinite if $r > 0$.

16.2 Complex Numbers and Circular Functions

When the coefficients of a second-order linear differential equation, $y''(t) + a_1 y'(t) +
a_2 y = b$, are such that $a_1^2 < 4a_2$, the characteristic-root formula (16.5) would call for
taking the square root of a negative number. Since the square of any positive or negative real
number is invariably positive, whereas the square of zero is zero, only a nonnegative real
number can ever yield a real-valued square root. Thus, if we confine our attention to the
real number system, as we have so far, no characteristic roots are available for this case
(Case 3). This fact motivates us to consider numbers outside of the real-number system.

Imaginary and Complex Numbers 

Conceptually, it is possible to define a number $i \equiv \sqrt{-1}$, which when squared will equal
$-1$. Because $i$ is the square root of a negative number, it is obviously not real-valued; it is
therefore referred to as an imaginary number. With it at our disposal, we may write a host
of other imaginary numbers, such as $\sqrt{-9} = \sqrt{9}\sqrt{-1} = 3i$ and $\sqrt{-2} = \sqrt{2}\,i$.

Extending its application a step further, we may construct yet another type of number,
one that contains a real part as well as an imaginary part, such as $(8 - i)$ and $(3 + 5i)$.
Known as complex numbers, these can be represented generally in the form $(h + vi)$,
where $h$ and $v$ are two real numbers.† Of course, in case $v = 0$, the complex number will
reduce to a real number, whereas if $h = 0$, it will become an imaginary number. Thus the
set of all real numbers (call it $\mathbf{R}$) constitutes a subset of the set of all complex numbers (call
it $\mathbf{C}$). Similarly, the set of all imaginary numbers (call it $\mathbf{I}$) also constitutes a subset of $\mathbf{C}$.
That is, $\mathbf{R} \subset \mathbf{C}$, and $\mathbf{I} \subset \mathbf{C}$. Furthermore, since the terms real and imaginary are mutually
exclusive, the sets $\mathbf{R}$ and $\mathbf{I}$ must be disjoint; that is, $\mathbf{R} \cap \mathbf{I} = \emptyset$.

† We employ the symbols $h$ (for horizontal) and $v$ (for vertical) in the general complex-number
notation, because we shall presently plot the values of $h$ and $v$, respectively, on the horizontal
and vertical axes of a two-dimensional diagram.
FIGURE 16.2

A complex number $(h + vi)$ can be represented graphically in what is called an Argand
diagram, as illustrated in Fig. 16.2. By plotting $h$ horizontally on the real axis and $v$
vertically on the imaginary axis, the number $(h + vi)$ can be specified by the point $(h, v)$, which
we have alternatively labeled C. The values of $h$ and $v$ are algebraically signed, of course,
so that if $h < 0$, the point C will be to the left of the point of origin; similarly, a negative $v$
will mean a location below the horizontal axis.

Given the values of $h$ and $v$, we can also calculate the length of the line OC by applying
Pythagoras's theorem, which states that the square of the hypotenuse of a right-angled
triangle is the sum of the squares of the other two sides. Denoting the length of OC by $R$
(for radius vector), we have

$$R^2 = h^2 + v^2 \qquad \text{and} \qquad R = \sqrt{h^2 + v^2} \qquad (16.10)$$

where the square root is always taken to be positive. The value of $R$ is sometimes called the
absolute value, or modulus, of the complex number $(h + vi)$. (Note that changing the signs
of $h$ and $v$ will produce no effect on the absolute value of the complex number, $R$.) Like $h$
and $v$, then, $R$ is real-valued, but unlike these other values, $R$ is always positive. We shall
find the number $R$ to be of great importance in the ensuing discussion.
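Formula (16.10) can be tried out directly. In the Python sketch below, the function name `modulus` and the sample number $3 - 4i$ are our own illustrative choices; the built-in `abs()` applied to a Python `complex` value is used as an independent check.

```python
import math


def modulus(h, v):
    """Length R of the radius vector OC for the complex number h + v*i, as in (16.10)."""
    return math.sqrt(h**2 + v**2)


z = complex(3, -4)           # the number 3 - 4i, chosen only for illustration
print(modulus(z.real, z.imag))
print(abs(z))                # Python's built-in abs() gives the same modulus
print(modulus(-3, 4))        # flipping the signs of h and v leaves R unchanged
```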

Complex Roots 

Meanwhile, let us return to formula (16.5) and examine the case of complex characteristic
roots. When the coefficients of a second-order differential equation are such that $a_1^2 < 4a_2$,
the square-root expression in (16.5) can be written as

$$\sqrt{a_1^2 - 4a_2} = \sqrt{4a_2 - a_1^2}\,\sqrt{-1} = \sqrt{4a_2 - a_1^2}\; i$$

Hence, if we adopt the shorthand

$$h = -\frac{a_1}{2} \qquad \text{and} \qquad v = \frac{\sqrt{4a_2 - a_1^2}}{2}$$

the two roots can be denoted by a pair of conjugate complex numbers:

$$r_1, r_2 = h \pm vi$$

These two complex roots are said to be "conjugate" because they always appear together,
one being the sum of $h$ and $vi$, and the other being the difference between $h$ and $vi$. Note
that they share the same absolute value $R$.


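Example 1

Find the roots of the characteristic equation $r^2 + r + 4 = 0$. Applying the familiar formula,
we have

$$r_1, r_2 = \frac{-1 \pm \sqrt{-15}}{2} = \frac{-1 \pm \sqrt{15}\sqrt{-1}}{2} = -\frac{1}{2} \pm \frac{\sqrt{15}}{2}\, i$$

which constitute a pair of conjugate complex numbers.

As before, we can use (16.6) to check our calculations. If correct, we should have
$r_1 + r_2 = -a_1\ (= -1)$ and $r_1 r_2 = a_2\ (= 4)$. Since we do find

$$r_1 + r_2 = \left(-\frac{1}{2} + \frac{\sqrt{15}\, i}{2}\right) + \left(-\frac{1}{2} - \frac{\sqrt{15}\, i}{2}\right) = -\frac{1}{2} - \frac{1}{2} = -1$$

and

$$r_1 r_2 = \left(-\frac{1}{2} + \frac{\sqrt{15}\, i}{2}\right)\left(-\frac{1}{2} - \frac{\sqrt{15}\, i}{2}\right) = \left(-\frac{1}{2}\right)^2 - \left(\frac{\sqrt{15}\, i}{2}\right)^2 = \frac{1}{4} - \frac{-15}{4} = 4$$

our calculation is indeed validated.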
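The same check can be run numerically with Python's built-in `complex` type. This is a sketch only: `conjugate_roots` is a hypothetical helper that returns the pair $(h, v)$ in the shorthand introduced above, and it covers the case $a_1^2 < 4a_2$ alone.

```python
import math


def conjugate_roots(a1, a2):
    """Return (h, v) so that the characteristic roots are h + v*i and h - v*i."""
    if a1**2 >= 4 * a2:
        raise ValueError("roots are real; this sketch covers a1**2 < 4*a2 only")
    h = -a1 / 2
    v = math.sqrt(4 * a2 - a1**2) / 2
    return h, v


h, v = conjugate_roots(1, 4)          # r**2 + r + 4 = 0
r1, r2 = complex(h, v), complex(h, -v)
print(r1, r2)
print(r1 + r2)            # should equal -a1 = -1 (up to rounding)
print(r1 * r2)            # should equal a2 = 4 (up to rounding)
print(abs(r1), abs(r2))   # the two roots share the same absolute value R
```

For this equation the common absolute value works out to $R = \sqrt{1/4 + 15/4} = 2$, which is also $\sqrt{a_2}$.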

Even in the complex-root case (Case 3), we may express the complementary function of
a differential equation according to (16.7); that is,

$$y_c = A_1 e^{(h + vi)t} + A_2 e^{(h - vi)t} = e^{ht}\left(A_1 e^{vit} + A_2 e^{-vit}\right) \qquad (16.11)$$

But a new feature has been introduced: the number $i$ now appears in the exponents of the two
expressions in parentheses. How do we interpret such imaginary exponential functions?

To facilitate their interpretation, it will prove helpful first to transform these expressions
into equivalent circular-function forms. As we shall presently see, the latter functions
characteristically involve periodic fluctuations of a variable. Consequently, the complementary
function (16.11), being translatable into circular-function forms, can also be expected to
generate a cyclical type of time path.

Circular Functions 

Consider a circle with its center at the point of origin and with a radius of length $R$, as
shown in Fig. 16.3. Let the radius, like the hand of a clock, rotate in the counterclockwise
direction. Starting from the position OA, it will gradually move into the position OP,
followed successively by such positions as OB, OC, and OD; and at the end of a cycle, it will
return to OA. Thereafter, the cycle will simply repeat itself.

FIGURE 16.3

When in a specific position, say OP, the clock hand will make a definite angle $\theta$ with
line OA, and the tip of the hand (P) will determine a vertical distance $v$ and a horizontal
distance $h$. As the angle $\theta$ changes during the process of rotation, $v$ and $h$ will vary, although
$R$ will not. Thus the ratios $v/R$ and $h/R$ must change with $\theta$; that is, these two ratios are
both functions of the angle $\theta$. Specifically, $v/R$ and $h/R$ are called, respectively, the sine
(function) of $\theta$ and the cosine (function) of $\theta$:

$$\sin\theta \equiv \frac{v}{R} \qquad (16.12)$$

$$\cos\theta \equiv \frac{h}{R} \qquad (16.13)$$

In view of their connection with a circle, these functions are referred to as circular
functions. Since they are also associated with a triangle, however, they are alternatively called
trigonometric functions. Another (and fancier) name for them is sinusoidal functions. The
sine and cosine functions are not the only circular functions; another frequently encountered
one is the tangent function, defined as

$$\tan\theta \equiv \frac{\sin\theta}{\cos\theta} = \frac{v}{h} \qquad (h \neq 0)$$

Our major concern here, however, will be with the sine and cosine functions.

The independent variable in a circular function is the angle $\theta$, so the mapping involved
here is from an angle to a ratio of two distances. Usually, angles are measured in degrees
(for example, 30°, 45°, and 90°); in analytical work, however, it is more convenient to measure
angles in radians instead. The advantage of the radian measure stems from the fact
that, when $\theta$ is so measured, the derivatives of circular functions will come out in neater
expressions, much as the base $e$ gives us neater derivatives for exponential and logarithmic
functions. But just how much is a radian? To explain this, let us return to Fig. 16.3,
where we have drawn the point P so that the length of the arc AP is exactly equal to the
radius $R$. A radian (abbreviated as rad) can then be defined as the size of the angle $\theta$
(in Fig. 16.3) formed by such an $R$-length arc. Since the circumference of the circle has
a total length of $2\pi R$ (where $\pi = 3.14159\ldots$), a complete circle must involve an angle
of $2\pi$ rad altogether. In terms of degrees, however, a complete circle makes an angle
of 360°; thus, by equating 360° to $2\pi$ rad, we can arrive at the following conversion
table:


Degrees   360      270        180     90        45        0
Radians   $2\pi$   $3\pi/2$   $\pi$   $\pi/2$   $\pi/4$   0

Properties of the Sine and Cosine Functions 

Given the length of $R$, the value of $\sin\theta$ hinges upon the way the value of $v$ changes in
response to changes in the angle $\theta$. In the starting position OA, we have $v = 0$. As the clock
hand moves counterclockwise, $v$ starts to assume an increasing positive value, culminating
in the maximum value of $v = R$ when the hand coincides with OB, that is, when $\theta =
\pi/2$ rad (= 90°). Further movement will gradually shorten $v$, until its value becomes zero
when the hand is in the position OC, i.e., when $\theta = \pi$ rad (= 180°). As the hand enters the
third quadrant, $v$ begins to assume negative values; in the position OD, we have $v = -R$.
In the fourth quadrant, $v$ is still negative, but it will increase from the value of $-R$ toward
the value of $v = 0$, which is attained when the hand returns to OA, that is, when $\theta =
2\pi$ rad (= 360°). The cycle then repeats itself.

When these illustrative values of $v$ are substituted into (16.12), we can obtain the results
shown in the "$\sin\theta$" row of Table 16.1. For a more complete description of the sine
function, however, see the graph in Fig. 16.4a, where the values of $\sin\theta$ are plotted against those
of $\theta$ (expressed in radians).

The value of $\cos\theta$, in contrast, depends instead upon the way that $h$ changes in response
to changes in $\theta$. In the starting position OA, we have $h = R$. Then $h$ gradually shrinks, till
$h = 0$ when $\theta = \pi/2$ (position OB). In the second quadrant, $h$ turns negative, and when
$\theta = \pi$ (position OC), $h = -R$. The value of $h$ gradually increases from $-R$ to zero in the
third quadrant, and when $\theta = 3\pi/2$ (position OD), we find that $h = 0$. In the fourth
quadrant, $h$ turns positive again, and when the hand returns to position OA ($\theta = 2\pi$), we again
have $h = R$. The cycle then repeats itself.

The substitution of these illustrative values of $h$ into (16.13) yields the results in the
bottom row of Table 16.1, but Fig. 16.4b gives a more complete depiction of the cosine
function.

The $\sin\theta$ and $\cos\theta$ functions share the same domain, namely, the set of all real numbers
(radian measures of $\theta$). In this connection, it may be pointed out that a negative angle
simply refers to the reverse rotation of the clock hand; for instance, a clockwise movement
from OA to OD in Fig. 16.3 generates an angle of $-\pi/2$ rad (= $-90$°). There is also a
common range for the two functions, namely, the closed interval $[-1, 1]$. For this reason,
the graphs of $\sin\theta$ and $\cos\theta$ are, in Fig. 16.4, confined to a definite horizontal band.

TABLE 16.1

$\theta$        0     $\pi/2$   $\pi$   $3\pi/2$   $2\pi$
$\sin\theta$    0     1         0       $-1$       0
$\cos\theta$    1     0         $-1$    0          1

FIGURE 16.4

A major distinguishing property of the sine and cosine functions is that both are periodic:
their values will repeat themselves for every $2\pi$ rad (a complete circle) the angle $\theta$
travels through. Each function is therefore said to have a period of $2\pi$. In view of this
periodicity feature, the following equations hold (for any integer $n$):

$$\sin(\theta + 2n\pi) = \sin\theta \qquad \cos(\theta + 2n\pi) = \cos\theta$$

That is, adding (or subtracting) any integer multiple of $2\pi$ to any angle $\theta$ will affect neither
the value of $\sin\theta$ nor that of $\cos\theta$.

The graphs of the sine and cosine functions indicate a constant range of fluctuation in
each period, namely, $\pm 1$. This is sometimes alternatively described by saying that the
amplitude of fluctuation is 1. By virtue of the identical period and the identical amplitude,
we see that the $\cos\theta$ curve, if shifted rightward by $\pi/2$, will be exactly coincident with the
$\sin\theta$ curve. These two curves are therefore said to differ only in phase, i.e., to differ only
in the location of the peak in each period. Symbolically, this fact may be stated by the
equation

$$\cos\theta = \sin\left(\theta + \frac{\pi}{2}\right)$$
The sine and cosine functions obey certain identities. Among these, the more frequently
used are

$$\sin(-\theta) \equiv -\sin\theta \qquad \cos(-\theta) \equiv \cos\theta \qquad (16.14)$$

$$\sin^2\theta + \cos^2\theta \equiv 1 \qquad [\text{where } \sin^2\theta \equiv (\sin\theta)^2, \text{ etc.}] \qquad (16.15)$$

$$\sin(\theta_1 \pm \theta_2) \equiv \sin\theta_1 \cos\theta_2 \pm \cos\theta_1 \sin\theta_2$$
$$\cos(\theta_1 \pm \theta_2) \equiv \cos\theta_1 \cos\theta_2 \mp \sin\theta_1 \sin\theta_2 \qquad (16.16)$$

The pair of identities (16.14) serves to underscore the fact that the cosine function is
symmetrical with respect to the vertical axis (that is, $\theta$ and $-\theta$ always yield the same cosine
value), while the sine function is not. Shown in (16.15) is the fact that, for any magnitude
of $\theta$, the sum of the squares of its sine and cosine is always unity. And the set of identities
in (16.16) gives the sine and cosine of the sum and difference of two angles $\theta_1$ and $\theta_2$.

Finally, a word about derivatives. Being continuous and smooth, both $\sin\theta$ and $\cos\theta$ are
differentiable. The derivatives, $d(\sin\theta)/d\theta$ and $d(\cos\theta)/d\theta$, are obtainable by taking the
limits, respectively, of the difference quotients $\Delta(\sin\theta)/\Delta\theta$ and $\Delta(\cos\theta)/\Delta\theta$ as $\Delta\theta \to 0$.
The results, stated here without proof, are

$$\frac{d}{d\theta} \sin\theta = \cos\theta \qquad (16.17)$$

$$\frac{d}{d\theta} \cos\theta = -\sin\theta \qquad (16.18)$$

It should be emphasized, however, that these derivative formulas are valid only when $\theta$ is
measured in radians; if measured in degrees, for instance, (16.17) will become
$d(\sin\theta)/d\theta = (\pi/180)\cos\theta$ instead. It is for the sake of getting rid of the factor $(\pi/180)$
that radian measures are preferred to degree measures in analytical work.

Example 2

Find the slope of the $\sin\theta$ curve at $\theta = \pi/2$. The slope of the sine curve is given by its
derivative ($= \cos\theta$). Thus, at $\theta = \pi/2$, the slope should be $\cos(\pi/2) = 0$. You may refer to
Fig. 16.4 for verification of this result.

Example 3

Find the second derivative of $\sin\theta$. From (16.17), we know that the first derivative of $\sin\theta$
is $\cos\theta$; therefore the desired second derivative is

$$\frac{d^2}{d\theta^2} \sin\theta = \frac{d}{d\theta} \cos\theta = -\sin\theta$$

Euler Relations 

In Sec. 9.5, it was shown that any function which has finite, continuous derivatives up to the
desired order can be expanded into a polynomial function. Moreover, if the remainder term
$R_n$ in the resulting Taylor series (expansion at any point $x_0$) or Maclaurin series (expansion
at $x_0 = 0$) happens to approach zero as the number of terms $n$ becomes infinite, the
polynomial may be written as an infinite series. We shall now expand the sine and cosine
functions and then attempt to show how the imaginary exponential expressions encountered in
(16.11) can be transformed into circular functions having equivalent expansions.
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For the sine ftinction, write tp(f>) = sin#; it then follows that 0(0) = sinO = 0. By 
successive derivation, we can get 


4>'{8) = co s& 
0"( 0) = -sin# 


<p\0) = cosO = I 
0"(O) = -sinO = 0 


0"'(#) = -cos# 
0 I4, (#) = sin# 
0 I5) (#) = cos# 


0'"(O) = -cosO = - I 
0 l4j (O) = sinO = 0 
0 (?i ( 0) = cosO = 1 


When substituted into (9.14), where $\theta$ now replaces $x$, these will give us the following
Maclaurin series with remainder:

$$\sin\theta = 0 + \theta + 0 - \frac{\theta^3}{3!} + 0 + \frac{\theta^5}{5!} + \cdots + \frac{\phi^{(n+1)}(p)}{(n+1)!}\,\theta^{n+1}$$


Now, the expression $\phi^{(n+1)}(p)$ in the last (remainder) term, which represents the $(n+1)$st
derivative evaluated at $\theta = p$, can only take the form of $\pm\cos p$ or $\pm\sin p$ and, as such, can
only take a value in the interval $[-1, 1]$, regardless of how large $n$ is. On the other hand,
$(n+1)!$ will grow rapidly as $n \to \infty$; in fact, much more rapidly than $\theta^{n+1}$ as $n$ increases.
Hence, the remainder term will approach zero as $n \to \infty$, and we can therefore express the
Maclaurin series as an infinite series:

$$\sin\theta = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots \tag{16.19}$$


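Similarly, if we write $\psi(\theta) = \cos\theta$, then $\psi(0) = \cos 0 = 1$, and the successive derivatives
will be

$$\begin{aligned}
\psi'(\theta) &= -\sin\theta & \psi'(0) &= -\sin 0 = 0 \\
\psi''(\theta) &= -\cos\theta & \psi''(0) &= -\cos 0 = -1 \\
\psi'''(\theta) &= \sin\theta & \psi'''(0) &= \sin 0 = 0 \\
\psi^{(4)}(\theta) &= \cos\theta & \psi^{(4)}(0) &= \cos 0 = 1 \\
\psi^{(5)}(\theta) &= -\sin\theta & \psi^{(5)}(0) &= -\sin 0 = 0
\end{aligned}$$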


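On the basis of these derivatives, we can expand $\cos\theta$ as follows:

$$\cos\theta = 1 + 0 - \frac{\theta^2}{2!} + 0 + \frac{\theta^4}{4!} + \cdots + \frac{\psi^{(n+1)}(p)}{(n+1)!}\,\theta^{n+1}$$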


Since the remainder term will again tend toward zero as $n \to \infty$, the cosine function is also
expressible as an infinite series, as follows:

$$\cos\theta = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \cdots \tag{16.20}$$
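
As a rough illustration of how (16.19) and (16.20) generate sine and cosine values, the following sketch sums a finite number of terms of each series. The function names `sin_series` and `cos_series` and the choice of 10 terms are assumptions made for this example; a finite partial sum is only an approximation, and its accuracy falls as $\theta$ moves away from zero.

```python
import math

def sin_series(theta, terms=10):
    """Partial sum of (16.19): theta - theta^3/3! + theta^5/5! - ..."""
    return sum((-1) ** k * theta ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

def cos_series(theta, terms=10):
    """Partial sum of (16.20): 1 - theta^2/2! + theta^4/4! - ..."""
    return sum((-1) ** k * theta ** (2 * k) / math.factorial(2 * k)
               for k in range(terms))

# A small table of values (theta in radians), next to the library functions.
for theta in (0.0, math.pi / 6, math.pi / 2, math.pi):
    print(f"{theta:.4f}  {sin_series(theta):+.6f}  {math.sin(theta):+.6f}"
          f"  {cos_series(theta):+.6f}  {math.cos(theta):+.6f}")
```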


You must have noticed that, with (16.19) and (16.20) at hand, we are now capable of
constructing a table of sine and cosine values for all possible values of $\theta$ (in radians). However,
our immediate interest lies in finding the relationship between imaginary exponential
expressions and circular functions. To this end, let us now expand the two exponential
expressions $e^{i\theta}$ and $e^{-i\theta}$. The reader will recognize that these are but special cases of the
expression $e^x$, which has previously been shown, in (10.6), to have the expansion

$$e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \frac{1}{4!}x^4 + \cdots$$


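Letting $x = i\theta$, therefore, we can immediately obtain

$$\begin{aligned}
e^{i\theta} &= 1 + i\theta + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \cdots \\
&= 1 + i\theta - \frac{\theta^2}{2!} - \frac{i\theta^3}{3!} + \frac{\theta^4}{4!} + \frac{i\theta^5}{5!} - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right)
\end{aligned}$$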


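Similarly, by setting $x = -i\theta$, the following result will emerge:

$$\begin{aligned}
e^{-i\theta} &= 1 - i\theta + \frac{(-i\theta)^2}{2!} + \frac{(-i\theta)^3}{3!} + \frac{(-i\theta)^4}{4!} + \frac{(-i\theta)^5}{5!} + \cdots \\
&= 1 - i\theta - \frac{\theta^2}{2!} + \frac{i\theta^3}{3!} + \frac{\theta^4}{4!} - \frac{i\theta^5}{5!} - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) - i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right)
\end{aligned}$$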


By substituting (16.19) and (16.20) into these two results, the following pair of identities,
known as the Euler relations, can readily be established:

$$e^{i\theta} = \cos\theta + i\sin\theta \tag{16.21}$$

$$e^{-i\theta} = \cos\theta - i\sin\theta \tag{16.21'}$$

These will enable us to translate any imaginary exponential function into an equivalent
linear combination of sine and cosine functions, and vice versa.
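
The Euler relations can also be illustrated numerically by summing the exponential series directly with an imaginary argument. The sketch below is an assumption-laden illustration rather than a proof: the names `exp_series` and `euler_rhs` and the cutoff of 30 terms are choices made here.

```python
import math

def exp_series(z, terms=30):
    """Partial sum of e^z = 1 + z + z^2/2! + z^3/3! + ... (complex z allowed)."""
    total = 0j
    term = 1 + 0j              # current term z^k / k!, starting with k = 0
    for k in range(terms):
        total += term
        term *= z / (k + 1)    # turn z^k/k! into z^(k+1)/(k+1)!
    return total

def euler_rhs(theta):
    """Right-hand side of (16.21): cos(theta) + i sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))

for theta in (math.pi, math.pi / 2, 1.0):
    gap_plus = abs(exp_series(1j * theta) - euler_rhs(theta))
    gap_minus = abs(exp_series(-1j * theta) - euler_rhs(theta).conjugate())
    print(f"theta = {theta:.4f}: gaps {gap_plus:.2e} and {gap_minus:.2e}")
```

Both gaps should be of the order of floating-point rounding error, for (16.21) and (16.21') respectively.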


Example 4

Find the value of $e^{i\pi}$. First let us convert this expression into a trigonometric expression. By
setting $\theta = \pi$ in (16.21), it is found that $e^{i\pi} = \cos\pi + i\sin\pi$. Since $\cos\pi = -1$ and
$\sin\pi = 0$, it follows that $e^{i\pi} = -1$.


Example 5  Show that $e^{-i\pi/2} = -i$. Setting $\theta = \pi/2$ in (16.21'), we have

$$e^{-i\pi/2} = \cos\frac{\pi}{2} - i\sin\frac{\pi}{2} = 0 - i(1) = -i$$


Alternative Representations of Complex Numbers 

So far, we have represented a pair of conjugate complex numbers in the general form
$(h \pm vi)$. Since $h$ and $v$ refer to the abscissa and ordinate in the Cartesian coordinate system
of an Argand diagram, the expression $(h \pm vi)$ represents the Cartesian form of a pair
of conjugate complex numbers. As a by-product of the discussion of circular functions and
Euler relations, we can now express $(h \pm vi)$ in two other ways.

Referring to Fig. 16.2, we see that as soon as $h$ and $v$ are specified, the angle $\theta$ and the
value of $R$ also become determinate. Since a given $\theta$ and a given $R$ can together identify a
unique point in the Argand diagram, we may employ $\theta$ and $R$ to specify the particular pair
of complex numbers. By rewriting the definitions of the sine and cosine functions in
(16.12) and (16.13) as

$$v = R\sin\theta \qquad \text{and} \qquad h = R\cos\theta \tag{16.22}$$

the conjugate complex numbers $(h \pm vi)$ can be transformed as follows:

$$h \pm vi = R\cos\theta \pm Ri\sin\theta = R(\cos\theta \pm i\sin\theta)$$

In so doing, we have in effect switched from the Cartesian coordinates of the complex
numbers ($h$ and $v$) to what are called their polar coordinates ($R$ and $\theta$). The right-hand
expression in the preceding equation, accordingly, exemplifies the polar form of a pair of
conjugate complex numbers.

Furthermore, in view of the Euler relations, the polar form may also be rewritten into the
exponential form as follows: $R(\cos\theta \pm i\sin\theta) = Re^{\pm i\theta}$. Hence, we have a total of three
alternative representations of the conjugate complex numbers:

$$h \pm vi = R(\cos\theta \pm i\sin\theta) = Re^{\pm i\theta} \tag{16.23}$$

If we are given the values of $R$ and $\theta$, the transformation to $h$ and $v$ is straightforward:
we use the two equations in (16.22). What about the reverse transformation? With given
values of $h$ and $v$, no difficulty arises in finding the corresponding value of $R$, which is
equal to $\sqrt{h^2 + v^2}$. But a slight ambiguity arises in regard to $\theta$: the desired value of $\theta$ (in
radians) is that which satisfies the two conditions $\cos\theta = h/R$ and $\sin\theta = v/R$; but for
given values of $h$ and $v$, $\theta$ is not unique! (Why?) Fortunately, the problem is not serious, for
by confining our attention to the interval $[0, 2\pi)$ in the domain, the indeterminacy is
quickly resolved.

Example 6  Find the Cartesian form of the complex number $5e^{3i\pi/2}$. Here we have $R = 5$ and
$\theta = 3\pi/2$; hence, by (16.22) and Table 16.1,

$$h = 5\cos\frac{3\pi}{2} = 0 \qquad \text{and} \qquad v = 5\sin\frac{3\pi}{2} = -5$$

The Cartesian form is thus simply $h + vi = -5i$.

Example 7  Find the polar and exponential forms of $(1 + \sqrt{3}\,i)$. In this case, we have $h = 1$ and
$v = \sqrt{3}$; thus $R = \sqrt{1 + 3} = 2$. Table 16.1 is of no use in locating the value of $\theta$ this time, but
Table 16.2, which lists some additional selected values of $\sin\theta$ and $\cos\theta$, will help.

TABLE 16.2

$\theta$        $\pi/6$         $\pi/4$                          $\pi/3$         $3\pi/4$
$\sin\theta$    $1/2$           $1/\sqrt{2}\ (= \sqrt{2}/2)$     $\sqrt{3}/2$    $1/\sqrt{2}\ (= \sqrt{2}/2)$
$\cos\theta$    $\sqrt{3}/2$    $1/\sqrt{2}\ (= \sqrt{2}/2)$     $1/2$           $-1/\sqrt{2}\ (= -\sqrt{2}/2)$

Specifically, we are seeking the value of $\theta$ such that $\cos\theta = h/R = 1/2$ and
$\sin\theta = v/R = \sqrt{3}/2$. The value $\theta = \pi/3$ meets the requirements. Thus, according to (16.23),
the desired transformation is

$$1 + \sqrt{3}\,i = 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right) = 2e^{i\pi/3}$$
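
The two transformations between Cartesian and polar coordinates can be sketched in a few lines. The function names `to_polar` and `to_cartesian` are hypothetical, chosen for this illustration only; the angle is confined to $[0, 2\pi)$ to resolve the ambiguity in $\theta$.

```python
import math

def to_polar(h, v):
    """Return (R, theta) for h + vi, with theta confined to [0, 2*pi)."""
    R = math.hypot(h, v)                        # R = sqrt(h^2 + v^2)
    theta = math.atan2(v, h) % (2 * math.pi)    # shift negative angles into [0, 2*pi)
    return R, theta

def to_cartesian(R, theta):
    """Return (h, v) from the two equations in (16.22)."""
    return R * math.cos(theta), R * math.sin(theta)

# The number 5 e^{3 i pi / 2}: h is zero up to rounding error, v is -5.
print(to_cartesian(5, 3 * math.pi / 2))

# The number 1 + sqrt(3) i: R is 2 and theta is pi/3 up to rounding error.
print(to_polar(1, math.sqrt(3)), math.pi / 3)
```

Using `math.atan2` rather than a table lookup is a design choice of this sketch; it selects the quadrant from the signs of $h$ and $v$, so both conditions $\cos\theta = h/R$ and $\sin\theta = v/R$ are met at once.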

Before leaving this topic, let us note an important extension of the result in (16.23).
Supposing that we have the $n$th power of a complex number, say, $(h + vi)^n$, how do we
write its polar and exponential forms? The exponential form is the easier to derive. Since
$h + vi = Re^{i\theta}$, it follows that

$$(h + vi)^n = (Re^{i\theta})^n = R^n e^{in\theta}$$

Similarly, we can write

$$(h - vi)^n = (Re^{-i\theta})^n = R^n e^{-in\theta}$$

Note that the power $n$ has brought about two changes: (1) $R$ now becomes $R^n$, and (2) $\theta$
now becomes $n\theta$. When these two changes are inserted into the polar form in (16.23), we
find that

$$(h \pm vi)^n = R^n(\cos n\theta \pm i\sin n\theta) \tag{16.23'}$$

That is,

$$[R(\cos\theta \pm i\sin\theta)]^n = R^n(\cos n\theta \pm i\sin n\theta)$$

Known as De Moivre's theorem, this result indicates that, to raise a complex number to the
$n$th power, one must simply modify its polar coordinates by raising $R$ to the $n$th power and
multiplying $\theta$ by $n$.
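
De Moivre's theorem can be checked against direct multiplication. The sketch below uses the number $1 + \sqrt{3}\,i$ (with $R = 2$ and $\theta = \pi/3$) and the power $n = 3$; the function name `power_by_de_moivre` and the choice of $n$ are assumptions made for illustration.

```python
import math

def power_by_de_moivre(R, theta, n):
    """Cartesian form of [R(cos theta + i sin theta)]^n, computed via (16.23')."""
    modulus = R ** n
    return complex(modulus * math.cos(n * theta), modulus * math.sin(n * theta))

z = complex(1, math.sqrt(3))       # polar coordinates R = 2, theta = pi/3
direct = z ** 3                    # repeated multiplication in Cartesian form
via_theorem = power_by_de_moivre(2, math.pi / 3, 3)

print(direct, via_theorem)         # both are -8 up to rounding error
```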


EXERCISE 16.2 


1. Find the roots of the following quadratic equations:

(a) $r^2 - 3r + 9 = 0$        (c) $2x^2 + x + 8 = 0$

(b) $r^2 + 2r + 17 = 0$       (d) $2x^2 - x + 1 = 0$

2. (a) How many degrees are there in a radian?

(b) How many radians are there in a degree?

3. With reference to Fig. 16.3, and by using Pythagoras's theorem, prove that

(a) $\sin^2\theta + \cos^2\theta \equiv 1$

(b) $\sin\dfrac{\pi}{4} = \cos\dfrac{\pi}{4} = \dfrac{1}{\sqrt{2}}$

4. By means of the identities (16.14), (16.15), and (16.16), show that:

(a) $\sin 2\theta \equiv 2\sin\theta\cos\theta$

(b) $\cos 2\theta \equiv 1 - 2\sin^2\theta$

(c) $\sin(\theta_1 + \theta_2) + \sin(\theta_1 - \theta_2) \equiv 2\sin\theta_1\cos\theta_2$

(d) $1 + \tan^2\theta \equiv \dfrac{1}{\cos^2\theta}$

(e) $\sin\left(\dfrac{\pi}{2} - \theta\right) \equiv \cos\theta$        (f) $\cos\left(\dfrac{\pi}{2} - \theta\right) \equiv \sin\theta$

5. By applying the chain rule:

(a) Write out the derivative formulas for $\dfrac{d}{d\theta}\sin f(\theta)$ and $\dfrac{d}{d\theta}\cos f(\theta)$, where $f(\theta)$ is a
function of $\theta$.

(b) Find the derivatives of $\cos\theta^3$, $\sin(\theta^2 + 3\theta)$, $\cos e^\theta$, and $\sin(1/\theta)$.

6. From the Euler relations, deduce that:

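(a) $e^{-i\pi} = -1$        (c) $e^{i\pi/4} = \dfrac{\sqrt{2}}{2}(1 + i)$

(b) $e^{i\pi/3} = \dfrac{1}{2}(1 + \sqrt{3}\,i)$        (d) $e^{-3i\pi/4} = -\dfrac{\sqrt{2}}{2}(1 + i)$

7. Find the Cartesian form of each complex number:

(a) $2\left(\cos\dfrac{\pi}{6} + i\sin\dfrac{\pi}{6}\right)$        (b) $4e^{i\pi/3}$        (c) $\sqrt{2}\,e^{-i\pi/4}$

8. Find the polar and exponential forms of the following complex numbers:

(a) $\dfrac{3}{2} + \dfrac{3\sqrt{3}}{2}\,i$        (b) $4(\sqrt{3} + i)$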


16.3 Analysis of the Complex-Root Case

With the concepts of complex numbers and circular functions at our disposal, we are now
prepared to approach the complex-root case (Case 3), referred to in Sec. 16.1. You will recall
that the classification of the three cases, according to the nature of the characteristic
roots, is concerned only with the complementary function of a differential equation. Thus,
we can continue to focus our attention on the reduced equation

$$y''(t) + a_1 y'(t) + a_2 y = 0 \qquad \text{[reproduced from (16.4)]}$$


The Complementary Function 

When the values of the coefficients $a_1$ and $a_2$ are such that $a_1^2 < 4a_2$, the characteristic
roots will be the pair of conjugate complex numbers

$$r_1, r_2 = h \pm vi$$

where

$$h = -\frac{1}{2}a_1 \qquad \text{and} \qquad v = \frac{1}{2}\sqrt{4a_2 - a_1^2}$$

The complementary function, as was already previewed, will thus be in the form

$$y_c = e^{ht}(A_1 e^{vit} + A_2 e^{-vit}) \qquad \text{[reproduced from (16.11)]}$$

Let us first transform the imaginary exponential expressions in the parentheses into
equivalent trigonometric expressions, so that we may interpret the complementary function
as a circular function. This may be accomplished by using the Euler relations. Letting
$\theta = vt$ in (16.21) and (16.21'), we find that

$$e^{vit} = \cos vt + i\sin vt \qquad \text{and} \qquad e^{-vit} = \cos vt - i\sin vt$$

From these, it follows that the complementary function in (16.11) can be rewritten as

$$\begin{aligned}
y_c &= e^{ht}[A_1(\cos vt + i\sin vt) + A_2(\cos vt - i\sin vt)] \\
&= e^{ht}[(A_1 + A_2)\cos vt + (A_1 - A_2)\,i\sin vt]
\end{aligned} \tag{16.24}$$

Furthermore, if we employ the shorthand symbols

$$A_5 \equiv A_1 + A_2 \qquad \text{and} \qquad A_6 \equiv (A_1 - A_2)\,i$$

it is possible to simplify (16.24) into†

$$y_c = e^{ht}(A_5\cos vt + A_6\sin vt) \tag{16.24'}$$

where the new arbitrary constants $A_5$ and $A_6$ are later to be definitized.
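
The equivalence of the exponential form (16.11) and the trigonometric form (16.24') can be illustrated numerically. In the sketch below, the function names and the parameter values ($h = -1$, $v = 4$, and a conjugate pair $A_1, A_2 = 0.5 \mp 1.5i$) are assumptions chosen only for illustration.

```python
import cmath
import math

def yc_exponential(t, h, v, A1, A2):
    """(16.11): e^{ht} (A1 e^{vit} + A2 e^{-vit})."""
    return math.exp(h * t) * (A1 * cmath.exp(1j * v * t) + A2 * cmath.exp(-1j * v * t))

def yc_trigonometric(t, h, v, A1, A2):
    """(16.24'): e^{ht} (A5 cos vt + A6 sin vt), with A5 and A6 built from A1 and A2."""
    A5 = A1 + A2
    A6 = (A1 - A2) * 1j
    return math.exp(h * t) * (A5 * math.cos(v * t) + A6 * math.sin(v * t))

h, v = -1.0, 4.0
A1, A2 = complex(0.5, -1.5), complex(0.5, 1.5)   # a conjugate pair, so A5 and A6 are real
for t in (0.0, 0.3, 1.0):
    print(t, yc_exponential(t, h, v, A1, A2), yc_trigonometric(t, h, v, A1, A2))
```

With these constants, $A_5 = 1$ and $A_6 = 3$, so both forms return the same value with a zero imaginary part at every $t$.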

If you are meticulous, you may feel somewhat uneasy about the substitution of $\theta$ by $vt$
in the foregoing procedure. The variable $\theta$ measures an angle, but $vt$ is a magnitude in units
of $t$ (in our context, time). Therefore, how can we make the substitution $\theta = vt$? The answer
to this question can best be explained with reference to the unit circle (a circle with radius
$R = 1$) in Fig. 16.5. True, we have been using $\theta$ to designate an angle; but since the angle
is measured in radian units, the value of $\theta$ is always the ratio of the length of arc $AB$ to the
radius $R$. When $R = 1$, we have specifically

$$\theta = \frac{\text{arc } AB}{R} = \frac{\text{arc } AB}{1} = \text{arc } AB$$

In other words, $\theta$ is not only the radian measure of the angle, but also the length of the
arc $AB$, which is a number rather than an angle. If the passing of time is charted on the
circumference of the unit circle (counterclockwise), rather than on a straight line as we do
in plotting a time series, it really makes no difference whatsoever whether we consider the
lapse of time as an increase in the radian measure of the angle $\theta$ or as a lengthening of the
arc $AB$. Even if $R \neq 1$, moreover, the same line of reasoning can apply, except that in that
case $\theta$ will be equal to (arc $AB$)$/R$ instead; i.e., the angle $\theta$ and the arc $AB$ will bear a fixed
proportion to each other, instead of being equal. Thus, the substitution $\theta = vt$ is indeed
legitimate.

† The fact that in defining $A_6$, we include in it the imaginary number $i$ is by no means an attempt to
"sweep the dirt under the rug." Because $A_6$ is an arbitrary constant, it can take an imaginary as well
as a real value. Nor is it true that, as defined, $A_6$ will necessarily turn out to be imaginary. Actually,
if $A_1$ and $A_2$ are a pair of conjugate complex numbers, say, $m \pm ni$, then $A_5$ and $A_6$ will both be
real: $A_5 = A_1 + A_2 = (m + ni) + (m - ni) = 2m$, and $A_6 = (A_1 - A_2)i = [(m + ni) - (m - ni)]i =
(2ni)i = -2n$.


An Example of Solution 

Let us find the solution of the differential equation

$$y''(t) + 2y'(t) + 17y = 34$$

with the initial conditions $y(0) = 3$ and $y'(0) = 11$.

Since $a_1 = 2$, $a_2 = 17$, and $b = 34$, we can immediately find the particular integral to be

$$y_p = \frac{b}{a_2} = \frac{34}{17} = 2 \qquad \text{[by (16.3)]}$$

Moreover, since $a_1^2 = 4 < 4a_2 = 68$, the characteristic roots will be the pair of conjugate
complex numbers $(h \pm vi)$, where

$$h = -\frac{1}{2}a_1 = -1 \qquad \text{and} \qquad v = \frac{1}{2}\sqrt{4a_2 - a_1^2} = \frac{1}{2}\sqrt{64} = 4$$

Hence, by (16.24'), the complementary function is

$$y_c = e^{-t}(A_5\cos 4t + A_6\sin 4t)$$

Combining $y_c$ and $y_p$, the general solution can be expressed as

$$y(t) = e^{-t}(A_5\cos 4t + A_6\sin 4t) + 2$$

To definitize the constants $A_5$ and $A_6$, we utilize the two initial conditions. First, by
setting $t = 0$ in the general solution, we find that

$$\begin{aligned}
y(0) &= e^0(A_5\cos 0 + A_6\sin 0) + 2 \\
&= (A_5 + 0) + 2 = A_5 + 2 \qquad [\cos 0 = 1;\ \sin 0 = 0]
\end{aligned}$$

By the initial condition $y(0) = 3$, we can thus specify $A_5 = 1$. Next, let us differentiate the
general solution with respect to $t$, using the product rule and the derivative formulas
(16.17) and (16.18) while bearing in mind the chain rule [Exercise 16.2-5], to find $y'(t)$
and then $y'(0)$:

$$y'(t) = -e^{-t}(A_5\cos 4t + A_6\sin 4t) + e^{-t}[A_5(-4\sin 4t) + 4A_6\cos 4t]$$

so that

$$\begin{aligned}
y'(0) &= -(A_5\cos 0 + A_6\sin 0) + (-4A_5\sin 0 + 4A_6\cos 0) \\
&= -(A_5 + 0) + (0 + 4A_6) = 4A_6 - A_5
\end{aligned}$$

By the second initial condition $y'(0) = 11$, and in view that $A_5 = 1$, it then becomes clear
that $A_6 = 3$.† The definite solution is, therefore,

$$y(t) = e^{-t}(\cos 4t + 3\sin 4t) + 2 \tag{16.25}$$

† Note that, here, $A_6$ indeed turns out to be a real number, even though we have included the
imaginary number $i$ in its definition.

As before, the $y_p$ component ($= 2$) can be interpreted as the intertemporal equilibrium
level of $y$, whereas the $y_c$ component represents the deviation from equilibrium. Because of
the presence of circular functions in $y_c$, the time path (16.25) may be expected to exhibit a
fluctuating pattern. But what specific pattern will it involve?
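
Before turning to the shape of the path, the steps of the worked example can be retraced in code. The sketch below is an illustration under stated assumptions: the names `solve_complex_root_case` and `ode_residual` are hypothetical, and the residual check relies on finite-difference approximations of the derivatives, so it only confirms the solution up to a small numerical error.

```python
import math

def solve_complex_root_case(a1, a2, b, y0, dy0):
    """Definite solution of y'' + a1*y' + a2*y = b when a1**2 < 4*a2."""
    if a1 ** 2 >= 4 * a2:
        raise ValueError("roots are not complex: need a1**2 < 4*a2")
    y_p = b / a2                              # particular integral
    h = -a1 / 2                               # real part of the roots
    v = math.sqrt(4 * a2 - a1 ** 2) / 2       # imaginary part of the roots
    A5 = y0 - y_p                             # from y(0) = A5 + y_p
    A6 = (dy0 - h * A5) / v                   # from y'(0) = h*A5 + v*A6

    def y(t):
        return math.exp(h * t) * (A5 * math.cos(v * t) + A6 * math.sin(v * t)) + y_p

    return h, v, A5, A6, y_p, y

def ode_residual(y, a1, a2, b, t, step=1e-4):
    """Approximate y'' + a1*y' + a2*y - b at t using finite differences."""
    d1 = (y(t + step) - y(t - step)) / (2 * step)
    d2 = (y(t + step) - 2 * y(t) + y(t - step)) / step ** 2
    return d2 + a1 * d1 + a2 * y(t) - b

h, v, A5, A6, y_p, y = solve_complex_root_case(2, 17, 34, 3, 11)
print(h, v, A5, A6, y_p)                      # the constants of the worked example
print(ode_residual(y, 2, 17, 34, 0.5))        # close to zero
```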

The Time Path 

We are familiar with the paths of a simple sine or cosine function, as shown in Fig. 16.4.
Now we must study the paths of certain variants and combinations of sine and cosine functions
so that we can interpret, in general, the complementary function (16.24')

$$y_c = e^{ht}(A_5\cos vt + A_6\sin vt)$$

and, in particular, the $y_c$ component of (16.25).

Let us first examine the term $(A_5\cos vt)$. By itself, the expression $(\cos vt)$ is a circular
function of $(vt)$, with period $2\pi$ ($= 6.2832$) and amplitude 1. The period of $2\pi$ means that
the graph will repeat its configuration every time that $(vt)$ increases by $2\pi$. When $t$ alone is
taken as the independent variable, however, repetition will occur every time $t$ increases by
$2\pi/v$, so that with reference to $t$, as is appropriate in dynamic economic analysis, we
shall consider the period of $(\cos vt)$ to be $2\pi/v$. (The amplitude, however, remains at 1.)
Now, when a multiplicative constant $A_5$ is attached to $(\cos vt)$, it causes the range of
fluctuation to change from $\pm 1$ to $\pm A_5$. Thus the amplitude now becomes $A_5$, though
the period is unaffected by this constant. In short, $(A_5\cos vt)$ is a cosine function of $t$, with
period $2\pi/v$ and amplitude $A_5$. By the same token, $(A_6\sin vt)$ is a sine function of $t$,
with period $2\pi/v$ and amplitude $A_6$.

There being a common period, the sum $(A_5\cos vt + A_6\sin vt)$ will also display a repeating
cycle every time $t$ increases by $2\pi/v$. To show this more rigorously, let us note that
for given values of $A_5$ and $A_6$ we can always find two constants $A$ and $\varepsilon$, such that

$$A_5 = A\cos\varepsilon \qquad \text{and} \qquad A_6 = -A\sin\varepsilon$$

Thus we may express the said sum as

$$\begin{aligned}
A_5\cos vt + A_6\sin vt &= A\cos\varepsilon\cos vt - A\sin\varepsilon\sin vt \\
&= A(\cos vt\cos\varepsilon - \sin vt\sin\varepsilon) \\
&= A\cos(vt + \varepsilon) \qquad \text{[by (16.16)]}
\end{aligned}$$

This is a modified cosine function of $t$, with amplitude $A$ and period $2\pi/v$, because every
time that $t$ increases by $2\pi/v$, $(vt + \varepsilon)$ will increase by $2\pi$, which will complete a cycle on
the cosine curve.
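
The constants $A$ and $\varepsilon$ can be computed explicitly from $A_5$ and $A_6$. The sketch below is one way of doing so, under the assumption that $A$ is taken as the nonnegative root $\sqrt{A_5^2 + A_6^2}$; the function name `amplitude_and_phase` and the sample values $A_5 = 1$, $A_6 = 3$, $v = 4$ are illustrative choices.

```python
import math

def amplitude_and_phase(A5, A6):
    """Return (A, eps) such that A5 = A cos(eps) and A6 = -A sin(eps)."""
    A = math.hypot(A5, A6)          # A = sqrt(A5^2 + A6^2)
    eps = math.atan2(-A6, A5)       # sin(eps) = -A6/A and cos(eps) = A5/A
    return A, eps

A5, A6, v = 1.0, 3.0, 4.0
A, eps = amplitude_and_phase(A5, A6)
for t in (0.0, 0.25, 1.0):
    lhs = A5 * math.cos(v * t) + A6 * math.sin(v * t)
    rhs = A * math.cos(v * t + eps)
    print(f"t = {t}: {lhs:.6f}  {rhs:.6f}")   # the two columns coincide
```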

Had $y_c$ consisted only of the expression $(A_5\cos vt + A_6\sin vt)$, the implication would
have been that the time path of $y$ would be a never-ending, constant-amplitude fluctuation
around the equilibrium value of $y$, as represented by $y_p$. But there is, in fact, also the multiplicative
term $e^{ht}$ to consider. This latter term is of major importance, for, as we shall see,
it holds the key to the question of whether the time path will converge.

If $h > 0$, the value of $e^{ht}$ will increase continually as $t$ increases. This will produce a
magnifying effect on the amplitude of $(A_5\cos vt + A_6\sin vt)$ and cause ever-greater deviations
from the equilibrium in each successive cycle. As illustrated in Fig. 16.6a, the time
path will in this case be characterized by explosive fluctuation. If $h = 0$, on the other hand,
then $e^{ht} = 1$, and the complementary function will simply be $(A_5\cos vt + A_6\sin vt)$,
which has been shown to have a constant amplitude. In this second case, each cycle will
display a uniform pattern of deviation from the equilibrium as illustrated by the time path
in Fig. 16.6b. This is a time path with uniform fluctuation. Last, if $h < 0$, the term $e^{ht}$ will
continually decrease as $t$ increases, and each successive cycle will have a smaller amplitude
than the preceding one, much as the way a ripple dies down. This case is illustrated in
Fig. 16.6c, where the time path is characterized by damped fluctuation. The solution in
(16.25), with $h = -1$, exemplifies this last case. It should be clear that only the case of
damped fluctuation can produce a convergent time path; in the other two cases, the time
path is nonconvergent or divergent.†

[FIGURE 16.6: time paths around the equilibrium level, showing (a) explosive fluctuation,
(b) uniform fluctuation, and (c) damped fluctuation]

In all three diagrams of Fig. 16.6, the intertemporal equilibrium is assumed to be stationary.
If it is a moving one, the three types of time path depicted will still fluctuate around
it, but since a moving equilibrium generally plots as a curve rather than a horizontal straight
line, the fluctuation will take on the nature of, say, a series of business cycles around a
secular trend.

† We shall use the two words nonconvergent and divergent interchangeably, although the latter is
more strictly applicable to the explosive than to the uniform variety of nonconvergence.

The Dynamic Stability of Equilibrium 

The concept of convergence of the time path of a variable is inextricably tied to the concept
of dynamic stability of the intertemporal equilibrium of that variable. Specifically, the equilibrium
is dynamically stable if, and only if, the time path is convergent. The condition for
convergence of the $y(t)$ path, namely, $h < 0$ (Fig. 16.6c), is therefore also the condition
for dynamic stability of the intertemporal equilibrium of $y$.

You will recall that, for Cases 1 and 2 where the characteristic roots are real, the condition
for dynamic stability of equilibrium is that every characteristic root be negative. In the
present case (Case 3), with complex roots, the condition seems to be more specialized; it
stipulates only that the real part ($h$) of the complex roots $(h \pm vi)$ be negative. However, it
is possible to unify all three cases and consolidate the seemingly different conditions into a
single, generally applicable one. Just interpret any real root $r$ as a complex root whose
imaginary part is zero ($v = 0$). Then the condition "the real part of every characteristic
root be negative" clearly becomes applicable to all three cases and emerges as the only
condition we need.
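
The unified stability condition lends itself to a compact check. In the sketch below, the function names and the sample coefficient pairs are assumptions made for illustration; treating every root as a complex number (with a zero imaginary part for real roots) mirrors the unifying interpretation just described.

```python
import cmath

def characteristic_roots(a1, a2):
    """Roots of r^2 + a1*r + a2 = 0, returned as complex numbers."""
    root_term = cmath.sqrt(a1 ** 2 - 4 * a2)
    return (-a1 + root_term) / 2, (-a1 - root_term) / 2

def is_dynamically_stable(a1, a2):
    """True when the real part of every characteristic root is negative."""
    return all(root.real < 0 for root in characteristic_roots(a1, a2))

def fluctuation_type(a1, a2):
    """Classify the complex-root case by the sign of h = -a1/2."""
    if a1 ** 2 >= 4 * a2:
        return "no fluctuation (real roots)"
    h = -a1 / 2
    if h > 0:
        return "explosive fluctuation"
    if h == 0:
        return "uniform fluctuation"
    return "damped fluctuation"

for a1, a2 in ((2, 17), (0, 9), (-4, 8), (3, 2)):
    print(a1, a2, characteristic_roots(a1, a2),
          is_dynamically_stable(a1, a2), fluctuation_type(a1, a2))
```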


EXERCISE 16.3 

Find the $y_p$ and the $y_c$, the general solution, and the definite solution of each of the
following:

1. $y''(t) - 4y'(t) + 8y = 0$;  $y(0) = 3$, $y'(0) = 7$

2. $y''(t) + 4y'(t) + 8y = 2$;  $y(0) = 2\frac{1}{4}$, $y'(0) = 4$

3. $y''(t) + 3y'(t) + 4y = 12$;  $y(0) = 1$, $y'(0) = 2$

4. $y''(t) - 2y'(t) + 10y = 5$;  $y(0) = 6$, $y'(0) = 8\frac{1}{2}$

5. $y''(t) + 9y = 3$;  $y(0) = 1$, $y'(0) = 3$

6. $2y''(t) - 12y'(t) + 20y = 40$;  $y(0) = 4$, $y'(0) = 5$

7. Which of the differential equations in Probs. 1 to 6 yield time paths with (a) damped
fluctuation; (b) uniform fluctuation; (c) explosive fluctuation?

16.4 A Market Model with Price Expectations

In the earlier formulation of the dynamic market model, both $Q_d$ and $Q_s$ are taken to be
functions of the current price $P$ alone. But sometimes buyers and sellers may base their
market behavior not only on the current price but also on the price trend prevailing at the
time, for the price trend is likely to lead them to certain expectations regarding the price
level in the future, and these expectations can, in turn, influence their demand and supply
decisions.

Price Trend and Price Expectations 

In the continuous-time context, the price-trend information is to be found primarily in the
two derivatives $dP/dt$ (whether price is rising) and $d^2P/dt^2$ (whether increasing at an
increasing rate). To take the price trend into account, let us now include these derivatives as
additional arguments in the demand and supply functions:

$$\begin{aligned}
Q_d &= D[P(t), P'(t), P''(t)] \\
Q_s &= S[P(t), P'(t), P''(t)]
\end{aligned}$$

If we confine ourselves to the linear version of these functions and simplify the notation for
the independent variables to $P$, $P'$, and $P''$, we can write

$$\begin{aligned}
Q_d &= \alpha - \beta P + mP' + nP'' \qquad (\alpha, \beta > 0) \\
Q_s &= -\gamma + \delta P + uP' + wP'' \qquad (\gamma, \delta > 0)
\end{aligned} \tag{16.26}$$

where the parameters $\alpha$, $\beta$, $\gamma$, and $\delta$ are merely carryovers from the previous market
models, but $m$, $n$, $u$, and $w$ are new.

The four new parameters, whose signs have not been restricted, embody the buyers' and
sellers' price expectations. If $m > 0$, for instance, a rising price will cause $Q_d$ to increase.
This would suggest that buyers expect the rising price to continue to rise and, hence, prefer
to increase their purchases now, when the price is still relatively low. The opposite sign for
$m$ would, on the other hand, signify the expectation of a prompt reversal of the price trend,
so the buyers would prefer to cut back current purchases and wait for a lower price to materialize
later. The inclusion of the parameter $n$ makes the buyers' behavior depend also on
the rate of change of $dP/dt$. Thus the new parameters $m$ and $n$ inject a substantial element
of price speculation into the model. The parameters $u$ and $w$ carry a similar implication on
the sellers' side of the picture.


A Simplified Model 

For simplicity, we shall assume that only the demand function contains price expectations.
Specifically, we let $m$ and $n$ be nonzero, but let $u = w = 0$ in (16.26). Further assume that
the market is cleared at every point of time. Then we may equate the demand and supply
functions to obtain (after normalizing) the differential equation

$$P'' + \frac{m}{n}P' - \frac{\beta + \delta}{n}P = -\frac{\alpha + \gamma}{n} \tag{16.27}$$

This equation is in the form of (16.2) with the following substitutions:

$$y = P \qquad a_1 = \frac{m}{n} \qquad a_2 = -\frac{\beta + \delta}{n} \qquad b = -\frac{\alpha + \gamma}{n}$$

Since this pattern of change of $P$ involves the second derivative $P''$ as well as the first
derivative $P'$, the present model is certainly distinct from the dynamic market model
presented in Sec. 15.2.
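
The normalization that leads to (16.27) can be written out step by step. The lines below simply equate the linear demand and supply functions of (16.26) with $u = w = 0$, collect terms on the left, and divide through by $n$ (nonzero by assumption); no symbols beyond those of the text are introduced.

```latex
\begin{align*}
\alpha - \beta P + mP' + nP'' &= -\gamma + \delta P
    && \text{(market clearance: } Q_d = Q_s\text{)} \\
nP'' + mP' - (\beta + \delta)P &= -(\alpha + \gamma)
    && \text{(collecting terms)} \\
P'' + \frac{m}{n}P' - \frac{\beta + \delta}{n}P &= -\frac{\alpha + \gamma}{n}
    && \text{(dividing through by } n \neq 0\text{)}
\end{align*}
```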

Note, however, that the present model differs from the previous model in yet another
way. In Sec. 15.2, a dynamic adjustment mechanism, $dP/dt = j(Q_d - Q_s)$, is present.
Since that equation implies that $dP/dt = 0$ if and only if $Q_d = Q_s$, the intertemporal
sense and the market-clearing sense of equilibrium are coincident in that model. In contrast,
the present model assumes market clearance at every moment of time. Thus every
price attained in the market is an equilibrium price in the market-clearing sense, although
it may not qualify as the intertemporal equilibrium price. In other words, the two senses
of equilibrium are now disparate. Note, also, that the adjustment mechanism $dP/dt =
j(Q_d - Q_s)$, containing a derivative, is what makes the previous market model dynamic.
In the present model, with no adjustment mechanism, the dynamic nature of the model
emanates instead from the expectation terms $mP'$ and $nP''$.

The Time Path of Price 

The intertemporal equilibrium price of this model, the particular integral $P_p$ (formerly
$y_p$), is easily found by using (16.3). It is

$$P_p = \frac{b}{a_2} = \frac{\alpha + \gamma}{\beta + \delta}$$

Because this is a (positive) constant, it represents a stationary equilibrium.

As for the complementary function $P_c$ (formerly $y_c$), there are three possible cases.


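Case 1 (distinct real roots)

$$\left(\frac{m}{n}\right)^2 > -4\left(\frac{\beta + \delta}{n}\right)$$

The complementary function of this case is, by (16.7),

$$P_c = A_1 e^{r_1 t} + A_2 e^{r_2 t}$$

where

$$r_1, r_2 = \frac{1}{2}\left[-\frac{m}{n} \pm \sqrt{\left(\frac{m}{n}\right)^2 + 4\left(\frac{\beta + \delta}{n}\right)}\,\right] \tag{16.28}$$

Accordingly, the general solution is

$$P(t) = P_c + P_p = A_1 e^{r_1 t} + A_2 e^{r_2 t} + \frac{\alpha + \gamma}{\beta + \delta} \tag{16.29}$$

Case 2 (double real roots)

$$\left(\frac{m}{n}\right)^2 = -4\left(\frac{\beta + \delta}{n}\right)$$

In this case, the characteristic roots take the single value

$$r = -\frac{m}{2n}$$

thus, by (16.9), the general solution may be written as

$$P(t) = A_3 e^{-mt/2n} + A_4 t e^{-mt/2n} + \frac{\alpha + \gamma}{\beta + \delta} \tag{16.29'}$$

Case 3 (complex roots)

$$\left(\frac{m}{n}\right)^2 < -4\left(\frac{\beta + \delta}{n}\right)$$

In this third and last case, the characteristic roots are the pair of conjugate complex
numbers

$$r_1, r_2 = h \pm vi$$

where

$$h = -\frac{m}{2n} \qquad \text{and} \qquad v = \frac{1}{2}\sqrt{-4\left(\frac{\beta + \delta}{n}\right) - \left(\frac{m}{n}\right)^2}$$

Therefore, by (16.24'), we have the general solution

$$P(t) = e^{-mt/2n}(A_5\cos vt + A_6\sin vt) + \frac{\alpha + \gamma}{\beta + \delta} \tag{16.29''}$$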

A couple of general conclusions can be deduced from these results. First, if $n > 0$, then
$-4(\beta + \delta)/n$ must be negative and hence less than $(m/n)^2$. Hence Cases 2 and 3 can immediately
be ruled out. Moreover, with $n$ positive (as are $\beta$ and $\delta$), the expression under
the square-root sign in (16.28) necessarily exceeds $(m/n)^2$, and thus the square root must
be greater than $|m/n|$. The $\pm$ sign in (16.28) would then produce one positive root ($r_1$) and
one negative root ($r_2$). Consequently, the intertemporal equilibrium is dynamically unstable,
unless the definitized value of the constant $A_1$ happens to be zero in (16.29).

Second, if $n < 0$, then all three cases become feasible. Under Case 1, we can be sure
that both roots will be negative if $m$ is negative. (Why?) Interestingly, the repeated root of
Case 2 will also be negative if $m$ is negative. Moreover, since $h$, the real part of the complex
roots in Case 3, takes the same value as the repeated root $r$ in Case 2, the negativity of $m$
will also guarantee that $h$ is negative. In short, for all three cases, the dynamic stability of
equilibrium is ensured when the parameters $m$ and $n$ are both negative.
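
These two conclusions can be illustrated with a small numerical experiment. The function names `price_roots` and `is_stable` and all the parameter values below are hypothetical, picked only so that each case arises once; they are not taken from the text.

```python
import cmath

def price_roots(beta, delta, m, n):
    """Characteristic roots of (16.27), computed as in (16.28)."""
    a1 = m / n
    a2 = -(beta + delta) / n
    root_term = cmath.sqrt(a1 ** 2 - 4 * a2)
    return (-a1 + root_term) / 2, (-a1 - root_term) / 2

def is_stable(beta, delta, m, n):
    """True when every root has a negative real part."""
    return all(r.real < 0 for r in price_roots(beta, delta, m, n))

# n > 0: two real roots of opposite sign, hence an unstable equilibrium.
print(price_roots(beta=1, delta=2, m=3, n=1), is_stable(1, 2, 3, 1))

# n < 0 and m < 0: stability in Case 1, Case 2, and Case 3, respectively.
print(price_roots(beta=1, delta=2, m=-4, n=-1), is_stable(1, 2, -4, -1))
print(price_roots(beta=0.5, delta=0.5, m=-2, n=-1), is_stable(0.5, 0.5, -2, -1))
print(price_roots(beta=1, delta=2, m=-3, n=-1), is_stable(1, 2, -3, -1))
```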


Example 1

Let the demand and supply functions be

$$\begin{aligned}
Q_d &= 42 - 4P - 4P' + P'' \\
Q_s &= -6 + 8P
\end{aligned}$$

with initial conditions $P(0) = 6$ and $P'(0) = 4$. Assuming market clearance at every point of
time, find the time path $P(t)$.

In this example, the parameter values are

$$\alpha = 42 \qquad \beta = 4 \qquad \gamma = 6 \qquad \delta = 8 \qquad m = -4 \qquad n = 1$$

Since $n$ is positive, our previous discussion suggests that only Case 1 can arise, and that the
two (real) roots $r_1$ and $r_2$ will take opposite signs. Substitution of the parameter values into
(16.28) indeed confirms this, for

$$r_1, r_2 = \frac{1}{2}\left(4 \pm \sqrt{16 + 48}\right) = \frac{1}{2}(4 \pm 8) = 6, -2$$

The general solution is, then, by (16.29),

$$P(t) = A_1 e^{6t} + A_2 e^{-2t} + 4$$

By taking the initial conditions into account, moreover, we find that $A_1 = A_2 = 1$, so the
definite solution is

$$P(t) = e^{6t} + e^{-2t} + 4$$

In view of the positive root $r_1 = 6$, the intertemporal equilibrium ($P_p = 4$) is dynamically
unstable.

The preceding solution is found by use of formulas (16.28) and (16.29). Alternatively, we
can first equate the given demand and supply functions to obtain the differential equation

$$P'' - 4P' - 12P = -48$$

and then solve this equation as a specific case of (16.2).
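
The definitizing step of this example, which the text leaves to the reader, amounts to solving two linear equations in $A_1$ and $A_2$. The sketch below spells it out; the function name `definite_constants_real_roots` is a hypothetical label, and the closed-form expressions in it follow from $A_1 + A_2 = P(0) - P_p$ and $r_1 A_1 + r_2 A_2 = P'(0)$.

```python
import math

def definite_constants_real_roots(r1, r2, P_p, P0, dP0):
    """Solve A1 + A2 = P0 - P_p and r1*A1 + r2*A2 = dP0 (distinct real roots)."""
    deviation = P0 - P_p
    A2 = (dP0 - r1 * deviation) / (r2 - r1)
    A1 = deviation - A2
    return A1, A2

# Coefficients of P'' - 4P' - 12P = -48, read as a specific case of (16.2).
a1, a2, b = -4, -12, -48
P_p = b / a2                                   # particular integral
root_term = math.sqrt(a1 ** 2 - 4 * a2)        # real, since a1^2 > 4*a2 here
r1, r2 = (-a1 + root_term) / 2, (-a1 - root_term) / 2

A1, A2 = definite_constants_real_roots(r1, r2, P_p, P0=6, dP0=4)

def price_path(t):
    return A1 * math.exp(r1 * t) + A2 * math.exp(r2 * t) + P_p

print(r1, r2, P_p, A1, A2, price_path(0))
```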

Example 2  Given the demand and supply functions

$$\begin{aligned}
Q_d &= 40 - 2P - 2P' - P'' \\
Q_s &= -5 + 3P
\end{aligned}$$

with $P(0) = 12$ and $P'(0) = 1$, find $P(t)$ on the assumption that the market is always
cleared.

Here the parameters $m$ and $n$ are both negative. According to our previous general discussion,
therefore, the intertemporal equilibrium should be dynamically stable. To find the
specific solution, we may first equate $Q_d$ and $Q_s$ to obtain the differential equation (after
multiplying through by $-1$)

$$P'' + 2P' + 5P = 45$$

The intertemporal equilibrium is given by the particular integral

$$P_p = \frac{45}{5} = 9$$

From the characteristic equation of the differential equation,

$$r^2 + 2r + 5 = 0$$

we find that the roots are complex:

$$r_1, r_2 = \frac{1}{2}\left(-2 \pm \sqrt{4 - 20}\right) = \frac{1}{2}(-2 \pm 4i) = -1 \pm 2i$$

This means that $h = -1$ and $v = 2$, so the general solution is

$$P(t) = e^{-t}(A_5\cos 2t + A_6\sin 2t) + 9$$

To definitize the arbitrary constants $A_5$ and $A_6$, we set $t = 0$ in the general solution, to
get

$$P(0) = e^0(A_5\cos 0 + A_6\sin 0) + 9 = A_5 + 9 \qquad [\cos 0 = 1;\ \sin 0 = 0]$$

Moreover, by differentiating the general solution and then setting $t = 0$, we find that

$$P'(t) = -e^{-t}(A_5\cos 2t + A_6\sin 2t) + e^{-t}(-2A_5\sin 2t + 2A_6\cos 2t) \qquad \text{[product rule and chain rule]}$$

and

$$\begin{aligned}
P'(0) &= -e^0(A_5\cos 0 + A_6\sin 0) + e^0(-2A_5\sin 0 + 2A_6\cos 0) \\
&= -(A_5 + 0) + (0 + 2A_6) = -A_5 + 2A_6
\end{aligned}$$

Thus, by virtue of the initial conditions $P(0) = 12$ and $P'(0) = 1$, we have $A_5 = 3$ and
$A_6 = 2$. Consequently, the definite solution is

$$P(t) = e^{-t}(3\cos 2t + 2\sin 2t) + 9$$

This time path is obviously one with periodic fluctuation; the period is $2\pi/v = \pi$. That is,
there is a complete cycle every time that $t$ increases by $\pi = 3.14159\ldots$. In view of the
multiplicative term $e^{-t}$, the fluctuation is damped. The time path, which starts from the
initial price $P(0) = 12$, converges to the intertemporal equilibrium price $P_p = 9$ in a cyclical
fashion.
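
The damped, cyclical character of this path can be seen by sampling it once per period. In the sketch below, the function name `price_path` and the sampling points $t = 0, \pi, 2\pi, 3\pi$ are illustrative choices; at these points the cosine term equals 1 and the sine term vanishes, so the deviation from equilibrium shrinks by the factor $e^{-\pi}$ from one sample to the next.

```python
import math

def price_path(t):
    """Definite solution of Example 2: e^{-t}(3 cos 2t + 2 sin 2t) + 9."""
    return math.exp(-t) * (3 * math.cos(2 * t) + 2 * math.sin(2 * t)) + 9

period = 2 * math.pi / 2          # 2*pi/v with v = 2
for k in range(4):
    t = k * period
    print(f"t = {t:.4f}  P = {price_path(t):.6f}  deviation = {price_path(t) - 9:.6f}")
```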


EXERCISE 16.4

1. Let the parameters $m$, $n$, $u$, and $w$ in (16.26) be all nonzero.

(a) Assuming market clearance at every point of time, write the new differential
equation of the model.

(b) Find the intertemporal equilibrium price.

(c) Under what circumstances can periodic fluctuation be ruled out?

2. Let the demand and supply functions be as in (16.26), but with $u = w = 0$ as in the
text discussion.

(a) If the market is not always cleared, but adjusts according to

$$\frac{dP}{dt} = j(Q_d - Q_s) \qquad (j > 0)$$

write the appropriate new differential equation.

(b) Find the intertemporal equilibrium price $\bar{P}$ and the market-clearing equilibrium
price $P^*$.

(c) State the condition for having a fluctuating price path. Can fluctuation occur if
$n > 0$?

3. Let the demand and supply be

$$Q_d = 9 - P + P' + 3P'' \qquad Q_s = -1 + 4P - P' + 5P''$$

with $P(0) = 4$ and $P'(0) = 4$.

(a) Find the price path, assuming market clearance at every point of time.

(b) Is the time path convergent? With fluctuation?

16.5 The Interaction of Inflation and Unemployment 

In this section, we illustrate the use of a second-order differential equation with a macro 
model dealing with the problem of inflation and unemployment. 

The Phillips Relation 

One of the most widely used concepts in the modern analysis of the problem of inflation
and unemployment is the Phillips relation.† In its original formulation, this relation depicts
an empirically based negative relation between the rate of growth of money wage and the
rate of unemployment:

    $w = f(U) \qquad [f'(U) < 0]$    (16.30)

† A. W. Phillips, "The Relationship Between Unemployment and the Rate of Change of Money Wage
Rates in the United Kingdom, 1861-1957," Economica, November 1958, pp. 283-299.
where the lowercase letter $w$ denotes the rate of growth of money wage $W$ (i.e., $w \equiv \dot{W}/W$)
and $U$ is the rate of unemployment. It thus pertains only to the labor market. Later usage,
however, has adapted the Phillips relation into a function that links the rate of inflation
(instead of $w$) to the rate of unemployment. This adaptation may be justified by arguing that
mark-up pricing is in wide use, so that a positive $w$, reflecting growing money-wage cost,
would necessarily carry inflationary implications. And this makes the rate of inflation, like
$w$, a function of $U$. The inflationary pressure of a positive $w$ can, however, be offset by an
increase in labor productivity, assumed to be exogenous, and denoted here by $T$. Specifically,
the inflationary effect can materialize only to the extent that money wage grows faster
than productivity. Denoting the rate of inflation (that is, the rate of growth of the price
level $P$) by the lowercase letter $p$ ($p \equiv \dot{P}/P$), we may thus write

    $p = w - T$    (16.31)

Combining (16.30) and (16.31), and adopting the linear version of the function $f(U)$, we
then get an adapted Phillips relation

    $p = \alpha - T - \beta U \qquad (\alpha, \beta > 0)$    (16.32)

The Expectations-Augmented Phillips Relation 

More recently, economists have preferred to use the expectations-augmented version of the
Phillips relation

    $w = f(U) + g\pi \qquad (0 < g \le 1)$    (16.30')

where $\pi$ denotes the expected rate of inflation. The underlying idea of (16.30'), as
propounded by the Nobel laureate Professor Friedman,† is that if an inflationary trend has been
in effect long enough, people are apt to form certain inflation expectations which they then
attempt to incorporate into their money-wage demands. Thus $w$ should be an increasing
function of $\pi$. Carried over to (16.32), this idea results in the equation

    $p = \alpha - T - \beta U + g\pi \qquad (0 < g \le 1)$    (16.33)

With the introduction of a new variable to denote the expected rate of inflation, it
becomes necessary to hypothesize how inflation expectations are specifically formed.‡
Here we adopt the adaptive expectations hypothesis

    $\dfrac{d\pi}{dt} = j(p - \pi) \qquad (0 < j \le 1)$    (16.34)

Note that, rather than explain the absolute magnitude of $\pi$, this equation describes instead
its pattern of change over time. If the actual rate of inflation $p$ turns out to exceed the
expected rate $\pi$, the latter, having now been proven to be too low, is revised upward
($d\pi/dt > 0$). Conversely, if $p$ falls short of $\pi$, then $\pi$ is revised in the downward direction.
In format, (16.34) closely resembles the adjustment mechanism $dP/dt = j(Q_d - Q_s)$ of
the market model. But here the driving force behind the adjustment is the discrepancy
between the actual and expected rates of inflation, rather than $Q_d$ and $Q_s$.

† Milton Friedman, "The Role of Monetary Policy," American Economic Review, March 1968, pp. 1-17.

‡ This is in contrast to Sec. 16.4, where price expectations were discussed without introducing a new
variable to represent the expected price. As a result, the assumptions regarding the formation of
expectations were only implicitly embedded in the parameters $m$, $n$, $u$, and $w$ in (16.26).

The Feedback from Inflation to Unemployment 

It is possible to consider (16.33) and (16.34) as constituting a complete model. Since there
are three variables in a two-equation system, however, one of the variables has to be taken
as exogenous. If $\pi$ and $p$ are considered endogenous, for instance, then $U$ must be treated
as exogenous. A more satisfying alternative is to introduce a third equation to explain the
variable $U$, so that the model will be richer in behavioral characteristics. More significantly,
this will provide us with an opportunity to take into account the feedback effect of inflation
on unemployment. Equation (16.33) tells us how $U$ affects $p$, largely from the supply side
of the economy. But $p$ surely can affect $U$ in return. For example, the rate of inflation may
influence the consumption-saving decisions of the public, hence also the aggregate demand
for domestic production, and the latter will, in turn, affect the rate of unemployment. Even
in the conduct of government policies of demand management, the rate of inflation can
make a difference in their effectiveness. Depending on the rate of inflation, a given level of
money expenditure (fiscal policy) could translate into varying levels of real expenditure,
and similarly, a given rate of nominal-money expansion (monetary policy) could mean
varying rates of real-money expansion. And these, in turn, would imply differing effects on
output and unemployment.

For simplicity, we shall only take into consideration the feedback through the conduct of
monetary policy. Denoting the nominal money balance by $M$ and its rate of growth by
$m \equiv \dot{M}/M$, let us postulate that†

    $\dfrac{dU}{dt} = -k(m - p) \qquad (k > 0)$    (16.35)

Recalling (10.25), and applying it backward, we see that the expression $(m - p)$ represents
the rate of growth of real money:

    $m - p = \dfrac{\dot{M}}{M} - \dfrac{\dot{P}}{P} = r_M - r_P = r_{(M/P)}$

Thus (16.35) stipulates that $dU/dt$ is negatively related to the rate of growth of real-money
balance. Inasmuch as the variable $p$ now enters into the determination of $dU/dt$, the model
now contains a feedback from inflation to unemployment.

The Time Path of $\pi$

Together, (16.33) through (16.35) constitute a closed model in the three variables $\pi$, $p$, and
$U$. By eliminating two of the three variables, however, we can condense the model into a
single differential equation in a single variable. Suppose that we let that single variable be
$\pi$. Then we may first substitute (16.33) into (16.34) to get

    $\dfrac{d\pi}{dt} = j(\alpha - T - \beta U) - j(1 - g)\pi$    (16.36)

† In an earlier discussion, we denoted the money supply by $M_s$, to distinguish it from the demand for
money $M_d$. Here, we can simply use the unsubscripted letter $M$, since there is no fear of confusion.

Had this equation contained the expression $dU/dt$ instead of $U$, we could have substituted
(16.35) into (16.36) directly. But as (16.36) stands, we must first deliberately create a
$dU/dt$ term by differentiating (16.36) with respect to $t$, with the result

    $\dfrac{d^2\pi}{dt^2} = -j\beta\,\dfrac{dU}{dt} - j(1 - g)\,\dfrac{d\pi}{dt}$    (16.37)

Substitution of (16.35) into this then yields

    $\dfrac{d^2\pi}{dt^2} = j\beta k m - j\beta k p - j(1 - g)\,\dfrac{d\pi}{dt}$    (16.37')

There is still a $p$ variable to be eliminated. To achieve that, we note that (16.34) implies

    $p = \dfrac{1}{j}\,\dfrac{d\pi}{dt} + \pi$    (16.38)

Using this result in (16.37'), and simplifying, we finally obtain the desired differential
equation in the variable $\pi$ alone:

    $\dfrac{d^2\pi}{dt^2} + \underbrace{[\beta k + j(1 - g)]}_{a_1}\,\dfrac{d\pi}{dt} + \underbrace{(j\beta k)}_{a_2}\,\pi = \underbrace{j\beta k m}_{b}$    (16.37'')

The particular integral of this equation is simply

    $\pi_p = \dfrac{b}{a_2} = m$

Thus, in this model, the intertemporal equilibrium value of the expected rate of inflation
hinges exclusively on the rate of growth of nominal money.

For the complementary function, the two roots are, as before,

    $r_1, r_2 = \frac{1}{2}\left(-a_1 \pm \sqrt{a_1^2 - 4a_2}\right)$    (16.39)

where, as may be noted from (16.37''), both $a_1$ and $a_2$ are positive. On a priori grounds, it
is not possible to determine whether $a_1^2$ would exceed, equal, or be less than $4a_2$. Thus all
three cases of characteristic roots (distinct real roots, repeated real roots, or complex
roots) can conceivably arise. Whichever case presents itself, however, the intertemporal
equilibrium will prove dynamically stable in the present model. This can be explained as
follows: Suppose, first, that Case 1 prevails, with $a_1^2 > 4a_2$. Then the square root in (16.39)
yields a real number. Since $a_2$ is positive, $\sqrt{a_1^2 - 4a_2}$ is necessarily less than $\sqrt{a_1^2} = a_1$. It
follows that $r_1$ is negative, as is $r_2$, implying a dynamically stable equilibrium. What if
$a_1^2 = 4a_2$ (Case 2)? In that event, the square root is zero, so that $r_1 = r_2 = -a_1/2 < 0$. And
the negativity of the repeated roots again implies dynamic stability. Finally, for Case 3, the
real part of the complex roots is $h = -a_1/2$. Since this has the same value as the repeated
roots under Case 2, the identical conclusion regarding dynamic stability applies.
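The three-case argument above can be checked numerically. The following is a minimal Python sketch, not part of the original text: it assumes the coefficients $a_1 = \beta k + j(1 - g)$ and $a_2 = j\beta k$ that result from condensing (16.33) through (16.35), and the function names and the parameter values in the demonstration are hypothetical choices made only for illustration.

```python
import cmath


def pi_equation_coefficients(beta, g, j, k):
    """Return (a1, a2) for the condensed equation in pi.

    Assumes a1 = beta*k + j*(1 - g) and a2 = j*beta*k, with all
    parameters positive and g, j no larger than 1.
    """
    a1 = beta * k + j * (1.0 - g)
    a2 = j * beta * k
    return a1, a2


def characteristic_roots(a1, a2):
    """Roots of r^2 + a1*r + a2 = 0, following (16.39)."""
    root_term = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + root_term) / 2.0, (-a1 - root_term) / 2.0


def is_dynamically_stable(a1, a2):
    """True when every root has a negative real part."""
    return all(root.real < 0 for root in characteristic_roots(a1, a2))


if __name__ == "__main__":
    # Hypothetical parameter sets (beta, g, j, k), one per case.
    trials = {
        "Case 1 (distinct real roots)": (4.0, 0.5, 1.0, 1.0),
        "Case 2 (repeated real roots)": (4.0, 1.0, 1.0, 1.0),
        "Case 3 (complex roots)": (3.0, 1.0, 0.75, 0.5),
    }
    for label, params in trials.items():
        a1, a2 = pi_equation_coefficients(*params)
        print(label, characteristic_roots(a1, a2), is_dynamically_stable(a1, a2))
```

Using `cmath.sqrt` lets one function cover all three cases, since it returns a complex number with zero imaginary part when the discriminant is nonnegative.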

Although we have only studied the time path of $\pi$, the model can certainly yield information
on the other variables, too. To find the time path of, say, the $U$ variable, we can
either start off by condensing the model into a differential equation in $U$ rather than $\pi$ (see
Exercise 16.5-2) or deduce the $U$ path from the $\pi$ path already found (see Example 1).





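Example 1

Let the three equations of the model take the specific forms

    $p = \frac{1}{6} - 3U + \pi$    (16.40)

    $\dfrac{d\pi}{dt} = \frac{3}{4}(p - \pi)$    (16.41)

    $\dfrac{dU}{dt} = -\frac{1}{2}(m - p)$    (16.42)

Then we have the parameter values $\beta = 3$, $g = 1$, $j = \frac{3}{4}$, and $k = \frac{1}{2}$; thus, with reference to
(16.37''), we find

    $a_1 = \beta k + j(1 - g) = \frac{3}{2} \qquad a_2 = j\beta k = \frac{9}{8} \qquad \text{and} \qquad b = j\beta k m = \frac{9}{8}m$

The particular integral is $b/a_2 = m$. With $a_1^2 < 4a_2$, the characteristic roots are complex:

    $r_1, r_2 = \frac{1}{2}\left(-\frac{3}{2} \pm \sqrt{\frac{9}{4} - \frac{9}{2}}\right) = \frac{1}{2}\left(-\frac{3}{2} \pm \frac{3}{2}i\right) = -\frac{3}{4} \pm \frac{3}{4}i$

That is, $h = -\frac{3}{4}$ and $v = \frac{3}{4}$. Consequently, the general solution for the expected rate of
inflation is

    $\pi(t) = e^{-3t/4}\left(A_5 \cos\frac{3}{4}t + A_6 \sin\frac{3}{4}t\right) + m$    (16.43)

which depicts a time path with damped fluctuation around the equilibrium value $m$.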

From this, we can also deduce the time paths for the $p$ and $U$ variables. According to
(16.41), $p$ can be expressed in terms of $\pi$ and $d\pi/dt$ by the equation

    $p = \frac{4}{3}\,\dfrac{d\pi}{dt} + \pi$

The $\pi$ path in the general solution (16.43) implies the derivative

    $\dfrac{d\pi}{dt} = -\frac{3}{4}e^{-3t/4}\left(A_5 \cos\frac{3}{4}t + A_6 \sin\frac{3}{4}t\right) + e^{-3t/4}\left(-\frac{3}{4}A_5 \sin\frac{3}{4}t + \frac{3}{4}A_6 \cos\frac{3}{4}t\right)$    [product rule and chain rule]

Using the solution (16.43) and its derivative, we thus have

    $p(t) = e^{-3t/4}\left(A_6 \cos\frac{3}{4}t - A_5 \sin\frac{3}{4}t\right) + m$    (16.44)

Like the expected rate of inflation $\pi$, the actual rate of inflation $p$ also has a fluctuating time
path converging to the equilibrium value $m$.

As for the $U$ variable, (16.40) tells us that it can be expressed in terms of $\pi$ and $p$ as
follows:

    $U = \frac{1}{3}(\pi - p) + \frac{1}{18}$

By virtue of the solutions (16.43) and (16.44), therefore, we can write the time path of the
rate of unemployment as

    $U(t) = \frac{1}{3}e^{-3t/4}\left[(A_5 - A_6)\cos\frac{3}{4}t + (A_5 + A_6)\sin\frac{3}{4}t\right] + \frac{1}{18}$    (16.45)

This path is, again, one with damped fluctuation, with $\frac{1}{18}$ as $\bar{U}$, the dynamically stable
intertemporal equilibrium value of $U$.
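As a cross-check on these closed-form paths, the three equations of the example can be integrated numerically. The sketch below is only an illustration under stated assumptions: it takes the example's equations to be $p = \frac{1}{6} - 3U + \pi$, $d\pi/dt = \frac{3}{4}(p - \pi)$, and $dU/dt = -\frac{1}{2}(m - p)$, uses a classical fourth-order Runge-Kutta step, and starts from hypothetical initial values; the function name, step size, and horizon are likewise arbitrary choices.

```python
def simulate(m, pi0, U0, t_end=40.0, dt=0.01):
    """Integrate the example's three-equation model with classical RK4.

    Returns (pi, p, U) at time t_end. pi0 and U0 are hypothetical
    starting values for expected inflation and unemployment.
    """
    def actual_inflation(pi, U):
        return 1.0 / 6.0 - 3.0 * U + pi            # (16.40)

    def slopes(pi, U):
        p = actual_inflation(pi, U)
        return 0.75 * (p - pi), -0.5 * (m - p)     # (16.41) and (16.42)

    pi, U = pi0, U0
    for _ in range(int(round(t_end / dt))):
        k1 = slopes(pi, U)
        k2 = slopes(pi + 0.5 * dt * k1[0], U + 0.5 * dt * k1[1])
        k3 = slopes(pi + 0.5 * dt * k2[0], U + 0.5 * dt * k2[1])
        k4 = slopes(pi + dt * k3[0], U + dt * k3[1])
        pi += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        U += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return pi, actual_inflation(pi, U), U


if __name__ == "__main__":
    # Two hypothetical money-growth rates, same starting point.
    for m in (0.02, 0.10):
        pi_end, p_end, U_end = simulate(m, pi0=0.08, U0=0.09)
        print(f"m = {m:.2f}: pi = {pi_end:.6f}, p = {p_end:.6f}, U = {U_end:.6f}")
```

Under these assumptions, both runs should settle with $\pi$ and $p$ close to the chosen $m$ and with $U$ close to $\frac{1}{18}$, whatever the value of $m$.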

Because the intertemporal equilibrium values of $\pi$ and $p$ are both equal to the monetary-policy
parameter $m$, the value of $m$ (the rate of growth of nominal money) provides the
axis around which the time paths of $\pi$ and $p$ fluctuate. If a change occurs in $m$, a new
equilibrium value of $\pi$ and $p$ will immediately replace the old one, and whatever values the $\pi$
and $p$ variables happen to take at the moment of the monetary-policy change will become
the initial values from which the new $\pi$ and $p$ paths emanate.

In contrast, the intertemporal equilibrium value $\bar{U}$ does not depend on $m$. According to
(16.45), $U$ converges to the constant $\frac{1}{18}$ regardless of the rate of growth of nominal money,
and hence regardless of the equilibrium rate of inflation. This constant equilibrium value of
$U$ is referred to as the natural rate of unemployment. The fact that the natural rate of
unemployment is consistent with any equilibrium rate of inflation can be represented in the $Up$
space by a vertical straight line parallel to the $p$ axis. That vertical line, relating the
equilibrium values of $U$ and $p$ to each other, is known as the long-run Phillips curve. The vertical
shape of this curve, however, is contingent upon a special parameter value assumed in this
example. When that value is altered, as in Exercise 16.5-4, the long-run Phillips curve may
no longer be vertical.


EXERCISE 16.5 


1. In the inflation-unemployment model, retain (16.33) and (16.34) but delete (16.35) 
and let U be exogenous instead. 

(a) What kind of differential equation will now arise? 

(b) How many characteristic roots can you obtain? Is it possible now to have periodic 
fluctuation in the complementary function? 

2. In the text discussion, we condensed the inflation-unemployment model into a differential
equation in the variable $\pi$. Show that the model can alternatively be condensed
into a second-order differential equation in the variable $U$, with the same $a_1$ and $a_2$
coefficients as in (16.37''), but a different constant term $b = kj[\alpha - T - (1 - g)m]$.

3. Let the adaptive expectations hypothesis (16.34) be replaced by the so-called perfect
foresight hypothesis $\pi = p$, but retain (16.33) and (16.35).

(a) Derive a differential equation in the variable $p$.

(b) Derive a differential equation in the variable $U$.

(c) How do these equations differ fundamentally from the one we obtained under the
adaptive expectations hypothesis?

(d) What change in parameter restriction is now necessary to make the new differential
equations meaningful?

4. In Example 1, retain (16.41) and (16.42) but replace (16.40) by

    $p = \frac{1}{6} - 3U + \frac{1}{3}\pi$

(a) Find $p(t)$, $\pi(t)$, and $U(t)$.

(b) Are the time paths still fluctuating? Still convergent?

(c) What are $\bar{p}$ and $\bar{U}$, the intertemporal equilibrium values of $p$ and $U$?

(d) Is it still true that $\bar{U}$ is functionally unrelated to $\bar{p}$? If we now link these two
equilibrium values to each other in a long-run Phillips curve, can we still get a vertical
curve? What assumption in Example 1 is thus crucial for deriving a vertical long-run
Phillips curve?
16.6 Differential Equations with a Variable Term

In the differential equations considered in Sec. 16.1,

    $y''(t) + a_1 y'(t) + a_2 y = b$

the right-hand term $b$ is a constant. What if, instead of $b$, we have on the right a variable
term: i.e., some function of $t$ such as $bt^2$, $e^{bt}$, or $b \sin t$? The answer is that we must then
modify our particular integral $y_p$. Fortunately, the complementary function is not affected
by the presence of a variable term, because $y_c$ deals only with the reduced equation, whose
right side is always zero.

Method of Undetermined Coefficients

We shall explain a method of finding $y_p$, known as the method of undetermined coefficients,
which is applicable to constant-coefficient variable-term differential equations, as long as
the variable term and its successive derivatives together contain only a finite number of
distinct types of expression (apart from multiplicative constants). The explanation of this
method can best be carried out with a concrete illustration.


Example 1

Find the particular integral of

    $y''(t) + 5y'(t) + 3y = 6t^2 - t - 1$    (16.46)

By definition, the particular integral is a value of $y$ satisfying the given equation, i.e., a value
of $y$ that will make the left side identically equal to the right side regardless of the value of
$t$. Since the left side contains the function $y(t)$ and the derivatives $y'(t)$ and $y''(t)$, whereas
the right side contains multiples of the expressions $t^2$, $t$, and a constant, we ask: What general
function form of $y(t)$, along with its first and second derivatives, will give us the three
types of expression $t^2$, $t$, and a constant? The obvious answer is a function of the form
$B_1 t^2 + B_2 t + B_3$ (where the $B_i$ are coefficients yet to be determined), for if we write the
particular integral as

    $y(t) = B_1 t^2 + B_2 t + B_3$

we can derive

    $y'(t) = 2B_1 t + B_2 \qquad \text{and} \qquad y''(t) = 2B_1$    (16.47)

and these three equations are indeed composed of the said types of expression. Substituting
these into (16.46) and collecting terms, we get

    Left side $= (3B_1)t^2 + (10B_1 + 3B_2)t + (2B_1 + 5B_2 + 3B_3)$

And when this is equated term by term to the right side, we can determine the coefficients
$B_i$ as follows:

    $3B_1 = 6$                        $B_1 = 2$
    $10B_1 + 3B_2 = -1$               $B_2 = -7$
    $2B_1 + 5B_2 + 3B_3 = -1$         $B_3 = 10$

Thus the desired particular integral can be written as

    $y_p = 2t^2 - 7t + 10$
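The term-by-term matching in this example amounts to solving a small triangular system, which can be sketched in Python. This is an illustrative helper, not the book's procedure verbatim: the function names are hypothetical, the right side is assumed to be a quadratic $c_2 t^2 + c_1 t + c_0$, and the coefficient of $y$ is assumed to be nonzero so that the trial form $B_1 t^2 + B_2 t + B_3$ works.

```python
from fractions import Fraction


def quadratic_particular_integral(a1, a2, c2, c1, c0):
    """Coefficients (B1, B2, B3) of y_p = B1*t^2 + B2*t + B3 for

        y'' + a1*y' + a2*y = c2*t^2 + c1*t + c0,

    assuming a2 != 0. Matching like powers of t gives
        a2*B1                  = c2
        2*a1*B1 + a2*B2        = c1
        2*B1 + a1*B2 + a2*B3   = c0
    which is solved from the top down.
    """
    if a2 == 0:
        raise ValueError("trial form fails when the coefficient of y is zero")
    a1, a2 = Fraction(a1), Fraction(a2)
    B1 = Fraction(c2) / a2
    B2 = (Fraction(c1) - 2 * a1 * B1) / a2
    B3 = (Fraction(c0) - 2 * B1 - a1 * B2) / a2
    return B1, B2, B3


def residual(a1, a2, c2, c1, c0, B, t):
    """Left side minus right side at time t for the trial solution B."""
    B1, B2, B3 = B
    y = B1 * t * t + B2 * t + B3
    dy = 2 * B1 * t + B2
    d2y = 2 * B1
    return d2y + a1 * dy + a2 * y - (c2 * t * t + c1 * t + c0)


if __name__ == "__main__":
    # Equation (16.46): y'' + 5y' + 3y = 6t^2 - t - 1
    B = quadratic_particular_integral(5, 3, 6, -1, -1)
    print(B)
    print([residual(5, 3, 6, -1, -1, B, t) for t in (0, 1, 2)])
```

Exact fractions are used so that the residual check returns exactly zero instead of a small floating-point remainder.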
This method can work only when the number of expression types is finite. (See Exercise
16.6-1.) In general, when this prerequisite is met, the particular integral may be taken
as being in the form of a linear combination of all the distinct expression types contained
in the given variable term, as well as in all its derivatives. Note, in particular, that a constant
expression should be included in the particular integral, if the original variable term or any
of its successive derivatives contains a constant term.

Example 2

As a further illustration, let us find the general form for the particular integral suitable for the
variable term $(b \sin t)$. Repeated differentiation yields, in this case, the successive derivatives
$(b \cos t)$, $(-b \sin t)$, $(-b \cos t)$, $(b \sin t)$, etc., which involve only two distinct types of
expression. We may therefore try a particular integral of the form $(B_1 \sin t + B_2 \cos t)$.

A Modification

In certain cases, a complication arises in applying the method. When the coefficient of the
$y$ term in the given differential equation is zero, such as in

    $y''(t) + 5y'(t) = 6t^2 - t - 1$

the previously used trial form for the $y_p$, namely, $B_1 t^2 + B_2 t + B_3$, will fail to work. The
cause of this failure is that, since the $y(t)$ term is out of the picture and since only derivatives
$y'(t)$ and $y''(t)$ as shown in (16.47) will be substituted into the left side, no $B_1 t^2$ term
will ever appear on the left to be equated to the $6t^2$ term on the right. The way out of this
kind of difficulty is to use instead the trial solution $t(B_1 t^2 + B_2 t + B_3)$; or if this too fails
[e.g., given the equation $y''(t) = 6t^2 - t - 1$], to use $t^2(B_1 t^2 + B_2 t + B_3)$, and so on.

Indeed, the same trick may be employed in yet another difficult circumstance, as is
illustrated in Example 3.

Example 3

Find the particular integral of

    $y''(t) + 3y'(t) - 4y = 2e^{-4t}$    (16.48)

Here, the variable term is in the form of $e^{-4t}$, but all of its successive derivatives (namely,
$-8e^{-4t}$, $32e^{-4t}$, $-128e^{-4t}$, etc.) take the same form as well. If we try the solution

    $y(t) = Be^{-4t}$    [with $y'(t) = -4Be^{-4t}$ and $y''(t) = 16Be^{-4t}$]

and substitute these into (16.48), we obtain the inauspicious result that

    Left side $= (16 - 12 - 4)Be^{-4t} = 0$    (16.49)

which obviously cannot be equated to the right-side term $2e^{-4t}$.

What causes this to happen is the fact that the exponential coefficient in the variable
term $(-4)$ happens to be equal to one of the roots of the characteristic equation of (16.48):

    $r^2 + 3r - 4 = 0$    (roots $r_1, r_2 = 1, -4$)

The characteristic equation, it will be recalled, is obtained through a process of
differentiation;† but the expression $(16 - 12 - 4)$ in (16.49) is derived through the same process. Not
surprisingly, therefore, $(16 - 12 - 4)$ is merely a specific version of $(r^2 + 3r - 4)$ with $r$ set
equal to $-4$. Since $-4$ happens to be a characteristic root, the quadratic expression

    $r^2 + 3r - 4 = 16 - 12 - 4$

must of necessity be identically zero.

† See the text discussion leading to (16.4'').
To cope with this situation, let us try instead the solution

    $y(t) = Bte^{-4t}$

with derivatives

    $y'(t) = (1 - 4t)Be^{-4t} \qquad \text{and} \qquad y''(t) = (-8 + 16t)Be^{-4t}$

Substituting these into (16.48) will now yield: left side $= -5Be^{-4t}$. When this is equated to
the right side, we determine the coefficient to be $B = -2/5$. Consequently, the desired
particular integral of (16.48) can be written as

    $y_p = -\frac{2}{5}te^{-4t}$
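Both halves of this example can be verified with a few lines of Python. The sketch below is illustrative only (the function names are ours): it confirms that $r = -4$ annihilates the characteristic polynomial, which is why the naive trial $Be^{-4t}$ fails, and then checks that the modified trial $Bte^{-4t}$ with $B = -2/5$ satisfies (16.48) at several arbitrary values of $t$.

```python
import math

B = -2.0 / 5.0  # coefficient determined in the text


def characteristic_polynomial(r):
    """r^2 + 3r - 4, the characteristic polynomial of (16.48)."""
    return r * r + 3 * r - 4


def y_p(t):
    """Modified trial solution B*t*e^(-4t)."""
    return B * t * math.exp(-4.0 * t)


def y_p_first(t):
    """First derivative: (1 - 4t)*B*e^(-4t)."""
    return (1.0 - 4.0 * t) * B * math.exp(-4.0 * t)


def y_p_second(t):
    """Second derivative: (-8 + 16t)*B*e^(-4t)."""
    return (-8.0 + 16.0 * t) * B * math.exp(-4.0 * t)


def residual(t):
    """Left side of (16.48) minus its right side, 2e^(-4t)."""
    left = y_p_second(t) + 3.0 * y_p_first(t) - 4.0 * y_p(t)
    return left - 2.0 * math.exp(-4.0 * t)


if __name__ == "__main__":
    print(characteristic_polynomial(-4))          # why B*e^(-4t) cannot work
    print([residual(t) for t in (0.0, 0.5, 1.0, 2.0)])
```

The residuals should be zero up to floating-point rounding.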


EXERCISE 16.6 

1. Show that the method of undetermined coefficients is inapplicable to the differential
equation $y''(t) + ay'(t) + by = t^{-1}$.

2. Find the particular integral of each of the following equations by the method of
undetermined coefficients:

(a) $y''(t) + 2y'(t) + y = t$

(b) $y''(t) + 4y'(t) + y = 2t^2$

(c) $y''(t) + y'(t) + 2y = e^t$

(d) $y''(t) + y'(t) + 3y = \sin t$


16.7 Higher-Order Linear Differential Equations

The methods of solution introduced in the previous sections are readily extended to an
$n$th-order linear differential equation. With constant coefficients and a constant term, such
an equation can be written generally as

    $y^{(n)}(t) + a_1 y^{(n-1)}(t) + \cdots + a_{n-1} y'(t) + a_n y = b$    (16.50)

Finding the Solution

In this case of constant coefficients and constant term, the presence of the higher derivatives
does not materially affect the method of finding the particular integral discussed
earlier.

If we try the simplest possible type of solution, $y = k$, we can see that all the derivatives
from $y'(t)$ to $y^{(n)}(t)$ will be zero; hence (16.50) will reduce to $a_n k = b$, and we can write

    $y_p = k = \dfrac{b}{a_n} \qquad (a_n \neq 0)$    [cf. (16.3)]

In case $a_n = 0$, however, we must try a solution of the form $y = kt$. Then, since $y'(t) = k$
while all the higher derivatives vanish, (16.50) can be reduced to $a_{n-1} k = b$, thereby yielding
the particular integral

    $y_p = kt = \dfrac{b}{a_{n-1}}\,t \qquad (a_n = 0;\ a_{n-1} \neq 0)$    [cf. (16.3')]

If it happens that $a_n = a_{n-1} = 0$, then this last solution will fail, too; instead, a solution of
the form $y = kt^2$ must be tried. Further adaptations of this procedure should be obvious.
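The "further adaptations" can be collected into one rule, sketched here in Python. This is our own extrapolation of the pattern in the text, not a formula quoted from it: if $a_{n-m}$ is the last nonzero coefficient, the trial $y = ct^m$ has $m$th derivative $m!\,c$ and vanishing higher derivatives, so $c = b/(m!\,a_{n-m})$. The function name and the sample equations in the demonstration are hypothetical.

```python
from fractions import Fraction
from math import factorial


def constant_term_particular_integral(coeffs, b):
    """Particular integral of y^(n) + a1*y^(n-1) + ... + an*y = b.

    coeffs is the list [a1, ..., an]. Returns (c, m), meaning
    y_p = c * t**m, where m is the lowest power of t for which the
    trial solution works.
    """
    # Scan a_n, a_(n-1), ... for the first nonzero coefficient.
    for m, a in enumerate(reversed(coeffs)):
        if a != 0:
            return Fraction(b) / (factorial(m) * Fraction(a)), m
    # Every a_i is zero: the equation is simply y^(n) = b.
    n = len(coeffs)
    return Fraction(b) / factorial(n), n


if __name__ == "__main__":
    print(constant_term_particular_integral([6, 14, 16, 8], 24))  # a_n nonzero
    print(constant_term_particular_integral([5, 0], 10))          # a_n = 0
    print(constant_term_particular_integral([0, 3, 0, 0], 12))    # a_n = a_(n-1) = 0
```

For $m = 0$ and $m = 1$ the rule reduces to the two formulas given in the text, since $0! = 1! = 1$.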
As for the complementary function, inclusion of the higher-order derivatives in the
differential equation has the effect of raising the degree of the characteristic equation. The
complementary function is defined as the general solution of the reduced equation

    $y^{(n)}(t) + a_1 y^{(n-1)}(t) + \cdots + a_{n-1} y'(t) + a_n y = 0$    (16.51)

Trying $y = Ae^{rt}$ $(A \neq 0)$ as a solution and utilizing the knowledge that this implies
$y'(t) = rAe^{rt}$, $y''(t) = r^2 Ae^{rt}$, $\ldots$, $y^{(n)}(t) = r^n Ae^{rt}$, we can rewrite (16.51) as

    $Ae^{rt}\left(r^n + a_1 r^{n-1} + \cdots + a_{n-1} r + a_n\right) = 0$

This equation is satisfied by any value of $r$ which satisfies the following ($n$th-degree
polynomial) characteristic equation

    $r^n + a_1 r^{n-1} + \cdots + a_{n-1} r + a_n = 0$    (16.51')

There will, of course, be $n$ roots to this polynomial, and each of these should be included in
the general solution of (16.51). Thus our complementary function should in general be in
the form

    $y_c = A_1 e^{r_1 t} + A_2 e^{r_2 t} + \cdots + A_n e^{r_n t} \qquad \left[= \sum_{i=1}^{n} A_i e^{r_i t}\right]$

As before, however, some modifications must be made in case the $n$ roots are not all real
and distinct. First, suppose that there are repeated roots, say, $r_1 = r_2 = r_3$. Then, to avoid
"collapsing," we must write the first three terms of the solution as $A_1 e^{r_1 t} + A_2 t e^{r_1 t} + A_3 t^2 e^{r_1 t}$
[cf. (16.9)]. In case we have $r_4 = r_1$ as well, the fourth term must be
altered to $A_4 t^3 e^{r_1 t}$, etc.

Second, suppose that two of the roots are complex, say,

    $r_5, r_6 = h \pm vi$

then the fifth and sixth terms in the preceding solution should be combined into the
following expression:

    $e^{ht}(A_5 \cos vt + A_6 \sin vt)$    [cf. (16.24')]

By the same token, if two distinct pairs of complex roots are found, there must be two such
trigonometric expressions (with a different set of values of $h$, $v$, and two arbitrary constants
for each).† As a further possibility, if there happen to be two pairs of repeated complex
roots, then we should use $e^{ht}$ as the multiplicative term for one but use $te^{ht}$ for the other.
Also, even though $h$ and $v$ have identical values in the repeated complex roots, a different
pair of arbitrary constants must now be assigned to each.

Once $y_p$ and $y_c$ are found, the general solution of the complete equation (16.50) follows
easily. As before, it is simply the sum of the complementary function and the particular
integral: $y(t) = y_c + y_p$. In this general solution, we can count a total of $n$ arbitrary
constants. Thus, to definitize the solution, as many as $n$ initial conditions will be required.

† It is of interest to note that, inasmuch as complex roots always come in conjugate pairs, we can be
sure of having at least one real root when the differential equation is of an odd order, i.e., when $n$ is
an odd number.
Example 1

Find the general solution of

    $y^{(4)}(t) + 6y'''(t) + 14y''(t) + 16y'(t) + 8y = 24$

The particular integral of this fourth-order equation is simply

    $y_p = \dfrac{24}{8} = 3$

Its characteristic equation is, by (16.51'),

    $r^4 + 6r^3 + 14r^2 + 16r + 8 = 0$

which can be factored into the form

    $(r + 2)(r + 2)(r^2 + 2r + 2) = 0$

From the first two parenthetical expressions, we can obtain the double roots $r_1 = r_2 = -2$,
but the last (quadratic) expression yields the pair of complex roots $r_3, r_4 = -1 \pm i$, with
$h = -1$ and $v = 1$. Consequently, the complementary function is

    $y_c = A_1 e^{-2t} + A_2 t e^{-2t} + e^{-t}(A_3 \cos t + A_4 \sin t)$

and the general solution is

    $y(t) = A_1 e^{-2t} + A_2 t e^{-2t} + e^{-t}(A_3 \cos t + A_4 \sin t) + 3$

The four constants $A_1$, $A_2$, $A_3$, and $A_4$ can be definitized, of course, if we are given four
initial conditions.

Note that all the characteristic roots in this example either are real and negative or are 
complex and with a negative real part. The time path must therefore be convergent, and 
the intertemporal equilibrium is dynamically stable. 
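The roots and the convergence claim of this example are easy to confirm numerically. The following Python sketch is illustrative only: the function names are ours, and the arbitrary constants passed to the general solution are hypothetical values, since the text gives no initial conditions.

```python
import math

# Coefficients of r^4 + 6r^3 + 14r^2 + 16r + 8, highest power first.
CHAR_COEFFS = (1, 6, 14, 16, 8)


def evaluate_polynomial(coeffs, r):
    """Horner's rule; works for real or complex r."""
    value = 0
    for c in coeffs:
        value = value * r + c
    return value


def derivative_coefficients(coeffs):
    """Coefficients of the derivative polynomial, highest power first."""
    degree = len(coeffs) - 1
    return tuple(c * (degree - i) for i, c in enumerate(coeffs[:-1]))


def general_solution(t, A1, A2, A3, A4):
    """General solution of the example, including the particular integral 3."""
    return (A1 * math.exp(-2 * t) + A2 * t * math.exp(-2 * t)
            + math.exp(-t) * (A3 * math.cos(t) + A4 * math.sin(t)) + 3)


if __name__ == "__main__":
    for root in (-2, complex(-1, 1), complex(-1, -1)):
        print(root, evaluate_polynomial(CHAR_COEFFS, root))
    # A repeated root is also a root of the derivative polynomial.
    print(evaluate_polynomial(derivative_coefficients(CHAR_COEFFS), -2))
    # Hypothetical constants; the path should approach 3 as t grows.
    print([general_solution(t, 1.0, -2.0, 0.5, 4.0) for t in (0, 5, 20, 60)])
```

Checking the derivative polynomial at $r = -2$ is a quick way to confirm that the root is repeated, which is what calls for the $te^{-2t}$ term.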

Convergence and the Routh Theorem 

The solution of a high-degree characteristic equation is not always an easy task. For this
reason, it should be of tremendous help if we can find a way of ascertaining the convergence
or divergence of a time path without having to solve for the characteristic roots.
Fortunately, there does exist such a method, which can provide a qualitative (though
nongraphic) analysis of a differential equation.

This method is to be found in the Routh theorem,† which states that:

The real parts of all of the roots of the $n$th-degree polynomial equation

    $a_0 r^n + a_1 r^{n-1} + \cdots + a_{n-1} r + a_n = 0$

are negative if and only if the first $n$ of the following sequence of determinants

    $|a_1|; \quad \begin{vmatrix} a_1 & a_3 \\ a_0 & a_2 \end{vmatrix}; \quad \begin{vmatrix} a_1 & a_3 & a_5 \\ a_0 & a_2 & a_4 \\ 0 & a_1 & a_3 \end{vmatrix}; \quad \begin{vmatrix} a_1 & a_3 & a_5 & a_7 \\ a_0 & a_2 & a_4 & a_6 \\ 0 & a_1 & a_3 & a_5 \\ 0 & a_0 & a_2 & a_4 \end{vmatrix}; \quad \ldots$

all are positive.

In applying this theorem, it should be remembered that $|a_1| \equiv a_1$. Further, it is to be
understood that we should take $a_m = 0$ for all $m > n$. For example, given a third-degree
polynomial equation ($n = 3$), we need to examine the signs of the first three determinants
listed in the Routh theorem; for that purpose, we should set $a_4 = a_5 = 0$.

† For a discussion of this theorem, and a sketch of its proof, see Paul A. Samuelson, Foundations of
Economic Analysis, Harvard University Press, 1947, pp. 429-435, and the references there cited.

The relevance of this theorem to the convergence problem should become self-evident
when we recall that, in order for the time path $y(t)$ to converge regardless of what the initial
conditions happen to be, all the characteristic roots of the differential equation must
have negative real parts. Since the characteristic equation (16.51') is an $n$th-degree polynomial
equation, with $a_0 = 1$, the Routh theorem can be of direct help in the testing of convergence.
In fact, we note that the coefficients of the characteristic equation (16.51') are
wholly identical with those of the given differential equation (16.51), so it is perfectly
acceptable to substitute the coefficients of (16.51) directly into the sequence of determinants
shown in the Routh theorem for testing, provided that we always take $a_0 = 1$.
Inasmuch as the condition cited in the theorem is given on the "if and only if" basis, it
obviously constitutes a necessary-and-sufficient condition.


Example 2 


Test by the Routh theorem whether the differential equation of Example 1 has a convergent
time path. This equation is of the fourth order, so $n = 4$. The coefficients are $a_0 = 1$, $a_1 = 6$,
$a_2 = 14$, $a_3 = 16$, $a_4 = 8$, and $a_5 = a_6 = a_7 = 0$. Substituting these into the first four
determinants, we find their values to be 6, 68, 800, and 6,400, respectively. Because they are all
positive, we can conclude that the time path is convergent.
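The determinant test lends itself to a short program. The Python sketch below is a minimal illustration under a stated assumption: it takes the $k$th determinant of the theorem to be the $k \times k$ upper-left block of the array whose row $i$, column $j$ entry is $a_{2j-i}$ (counting from 1, with $a_m = 0$ outside $0 \le m \le n$), a layout that reproduces the four values quoted in Example 2. The function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination with exact fractions."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def routh_determinants(coeffs):
    """First n determinants for coeffs = [a0, a1, ..., an]."""
    n = len(coeffs) - 1

    def a(index):
        return Fraction(coeffs[index]) if 0 <= index <= n else Fraction(0)

    # Entry in (0-based) row i, column j is a_(2(j+1) - (i+1)).
    array = [[a(2 * (j + 1) - (i + 1)) for j in range(n)] for i in range(n)]
    return [determinant([row[:k] for row in array[:k]]) for k in range(1, n + 1)]


def is_convergent(coeffs):
    """True when all n determinants are positive."""
    return all(d > 0 for d in routh_determinants(coeffs))


if __name__ == "__main__":
    example = [1, 6, 14, 16, 8]             # Example 1 of this section
    print(routh_determinants(example), is_convergent(example))
```

A coefficient list for any other equation in the form (16.51) can be passed in the same way, always with 1 as the leading entry.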


EXERCISE 16.7 

1. Find the particular integral of each of the following: 

(0) y"'(() + 2y"(0 + y'(0 + 2y = & 

(b) y"'(t) + y"(t) + 2y'(t) = -\ 

(c) 3y" r (t) + 9y"(t) = 1 

(d) yW(t) + y"(t) = A 

2. Find the $y_p$ and the $y_c$ (and hence the general solution) of:

(a) $y'''(t) - 2y''(t) - y'(t) + 2y = 4$

[Hint: $r^3 - 2r^2 - r + 2 = (r - 1)(r + 1)(r - 2)$]

(b) $y'''(t) + 7y''(t) + 15y'(t) + 9y = 0$

[Hint: $r^3 + 7r^2 + 15r + 9 = (r + 1)(r^2 + 6r + 9)$]

(c) $y'''(t) + 6y''(t) + 10y'(t) + 8y = 8$

[Hint: $r^3 + 6r^2 + 10r + 8 = (r + 4)(r^2 + 2r + 2)$]

3. On the basis of the signs of the characteristic roots obtained in Prob. 2, analyze the 
dynamic stability of equilibrium. Then check your answer by the Routh theorem. 

4. Without finding their characteristic roots, determine whether the following differential
equations will give rise to convergent time paths:

(a) $y'''(t) - 10y''(t) + 27y'(t) - 18y = 3$

(b) $y'''(t) - 11y''(t) + 34y'(t) + 24y = 5$

(c) $y'''(t) + 4y''(t) - 5y'(t) - 2y = -2$

5. Deduce from the Routh theorem that, for the second-order linear differential equation
$y''(t) + a_1 y'(t) + a_2 y = b$, the solution path will be convergent regardless of initial
conditions if and only if the coefficients $a_1$ and $a_2$ are both positive.



Chapter 17


Discrete Time: First-Order
Difference Equations


In the continuous-time context, the pattern of change of a variable $y$ is embodied in the
derivatives $y'(t)$, $y''(t)$, etc. The time change involved in these is occurring continuously.
When time is, instead, taken to be a discrete variable, so that the variable $t$ is allowed to take
integer values only, the concept of the derivative obviously will no longer be appropriate.
Then, as we shall see, the pattern of change of the variable $y$ must be described by so-called
differences, rather than by derivatives or differentials, of $y(t)$. Accordingly, the techniques
of differential equations will give way to those of difference equations.

When we are dealing with discrete time, the value of variable $y$ will change only when
the variable $t$ changes from one integer value to the next, such as from $t = 1$ to $t = 2$.
Meanwhile, nothing is supposed to happen to $y$. In this light, it becomes more convenient
to interpret the values of $t$ as referring to periods, rather than points, of time, with $t = 1$
denoting period 1 and $t = 2$ denoting period 2, and so forth. Then we may simply regard $y$
as having one unique value in each time period. In view of this interpretation, the discrete-time
version of economic dynamics is often referred to as period analysis. It should be
emphasized, however, that "period" is being used here not in the calendar sense but in the
analytical sense. Hence, a period may involve one extent of calendar time in a particular
economic model, but an altogether different one in another. Even in the same model,
moreover, each successive period should not necessarily be construed as meaning equal
calendar time. In the analytical sense, a period is merely a length of time that elapses before the
variable $y$ undergoes a change.


17.1 Discrete Time, Differences, and Difference Equations 

The change from continuous time to discrete time produces no effect on the fundamental 
nature of dynamic analysis, although the formulation of the problem must be altered. Basi¬ 
cally, our dynamic problem is still to find a time path from some given pattern of change of 
a variable y over time. But the pattern of change should now be represented by the differ¬ 
ence quotient Av/Af, which is the discrete-time counterpart of the derivative dy/dt. 
Recall, however, that t can now take only integer values; thus, when we are comparing the 
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values of)' in two consecutive periods, we must have At = 1. For this reason, the difference 
quotient Ay/At can be simplified to the expression Ay; this is called ihc first difference 
of y. The symbol A, meaning difference, can accordingly be interpreted as a directive to 
take the first difference of (y). As such, it constitutes the discrete-time counterpart of the 
operator symbol d/dt. 

The expression $\Delta y$ can take various values, of course, depending on which two consecutive
time periods are involved in the difference-taking (or “differencing”). To avoid ambiguity,
let us add a time subscript to $y$ and define the first difference more specifically, as
follows:

$$\Delta y_t \equiv y_{t+1} - y_t \tag{17.1}$$

where $y_t$ means the value of $y$ in the $t$th period, and $y_{t+1}$ is its value in the period
immediately following the $t$th period. With this symbology, we may describe the pattern of change
of $y$ by an equation such as

$$\Delta y_t = 2 \tag{17.2}$$

or

$$\Delta y_t = -0.1y_t \tag{17.3}$$

Equations of this type are called difference equations. Note the striking resemblance 
between the last two equations, on the one hand, and the differential equations dy/dt = 2 
and dy/dt = -0.1 y on the other. 

Even though difference equations derive their name from difference expressions such as
$\Delta y_t$, there are alternate equivalent forms of such equations which are completely free of
$\Delta$ expressions and which are more convenient to use. By virtue of (17.1), we can rewrite
(17.2) as

$$y_{t+1} - y_t = 2 \tag{17.2'}$$

or

$$y_{t+1} = y_t + 2 \tag{17.2''}$$

For (17.3), the corresponding alternate equivalent forms are

$$y_{t+1} - 0.9y_t = 0 \tag{17.3'}$$

or

$$y_{t+1} = 0.9y_t \tag{17.3''}$$

The double-prime-numbered versions will prove convenient when we are calculating a
$y$ value from a known $y$ value of the preceding period. In later discussions, however, we
shall employ mostly the single-prime-numbered versions, i.e., those of (17.2') and (17.3').
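
As a quick numerical check on the equivalence of the $\Delta$ form and the recursive form, the following Python sketch generates paths from (17.2'') and (17.3'') and then takes their first differences. The helper names `iterate` and `first_difference` and the starting values are illustrative assumptions, not part of the text.

```python
def iterate(step, y0, periods):
    """Apply y[t+1] = step(y[t]) for the given number of periods, starting from y0."""
    path = [y0]
    for _ in range(periods):
        path.append(step(path[-1]))
    return path


def first_difference(path):
    """Return the list of first differences y[t+1] - y[t]."""
    return [nxt - cur for cur, nxt in zip(path, path[1:])]


# Form (17.2''): y[t+1] = y[t] + 2, with an assumed starting value of 15.
path_a = iterate(lambda y: y + 2, 15, 5)
# Form (17.3''): y[t+1] = 0.9 * y[t], with an assumed starting value of 100.
path_b = iterate(lambda y: 0.9 * y, 100.0, 5)

# Every first difference of path_a equals 2, as (17.2) requires.
print(first_difference(path_a))
# Every first difference of path_b equals -0.1 * y[t] (up to rounding), as in (17.3).
print(all(abs(d + 0.1 * y) < 1e-9 for d, y in zip(first_difference(path_b), path_b)))
```

The recursion is what one computes with; the first difference is recovered from it afterward, which is why the $\Delta$-free forms are the more convenient ones in practice.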

It is important to note that the choice of time subscripts in a difference equation is somewhat
arbitrary. For instance, without any change in meaning, (17.2') can be rewritten as
$y_t - y_{t-1} = 2$, where $(t - 1)$ refers to the period which immediately precedes the $t$th. Or,
we may express it equivalently as $y_{t+2} - y_{t+1} = 2$.
Also, it may be pointed out that, although we have consistently used subscripted $y$ symbols,
it is also acceptable to use $y(t)$, $y(t + 1)$, and $y(t - 1)$ in their stead. In order to avoid
using the notation $y(t)$ for both continuous-time and discrete-time cases, however, we
shall, in the discussion of period analysis, adhere to the subscript device.

Analogous to differential equations, difference equations can be either linear or nonlinear,
homogeneous or nonhomogeneous, and of the first or second (or higher) orders. Take
(17.2') for instance. It can be classified as: (1) linear, for no $y$ term (of any period) is raised
to the second (or higher) power or is multiplied by a $y$ term of another period; (2) nonhomogeneous,
since the right-hand side (where there is no $y$ term) is nonzero; and (3) of the
first order, because there exists only a first difference $\Delta y_t$, involving a one-period time lag
only. (In contrast, a second-order difference equation, to be discussed in Chap. 18, involves
a two-period lag and thus entails three $y$ terms: $y_{t+2}$, $y_{t+1}$, as well as $y_t$.)

Actually, (17.2') can also be characterized as having constant coefficients and a constant
term ($= 2$). Since the constant-coefficient case is the only one we shall consider, this
characterization will henceforth be implicitly assumed. Throughout the present chapter, the
constant-term feature will also be retained, although a method of dealing with the variable-term
case will be discussed in Chap. 18.

Check that the equation (17.3') is also linear and of the first order; but unlike (17.2'), it
is homogeneous.


17.2 Solving a First-Order Difference Equation

In solving a differential equation, our objective was to find a time path $y(t)$. As we know,
such a time path is a function of time which is totally free from any derivative (or differential)
expressions and which is perfectly consistent with the given differential equation as
well as with its initial conditions. The time path we seek from a difference equation is similar
in nature. Again, it should be a function of $t$ (a formula defining the values of $y$ in
every time period) which is consistent with the given difference equation as well as with
its initial conditions. Besides, it must not contain any difference expressions such as $\Delta y_t$
(or expressions like $y_{t+1} - y_t$).

Solving differential equations is, in the final analysis, a matter of integration. How do we
solve a difference equation?

Iterative Method 

Before developing a general method of attack, let us first explain a relatively pedestrian
method, the iterative method, which, though crude, will prove immensely revealing of the
essential nature of a so-called solution.

In this chapter we are concerned only with the first-order case; thus the difference equation
describes the pattern of change of $y$ between two consecutive periods only. Once such
a pattern is specified, such as by (17.2''), and once we are given an initial value $y_0$, it is no
problem to find $y_1$ from the equation. Similarly, once $y_1$ is found, $y_2$ will be immediately
obtainable, and so forth, by repeated application (iteration) of the pattern of change
specified in the difference equation. The results of iteration will then permit us to infer a
time path.
Example 1

Find the solution of the difference equation (17.2), assuming an initial value of $y_0 = 15$. To
carry out the iterative process, it is more convenient to use the alternative form of the
difference equation (17.2''), namely, $y_{t+1} = y_t + 2$, with $y_0 = 15$. From this equation, we
can deduce step-by-step that

$$\begin{aligned}
y_1 &= y_0 + 2 \\
y_2 &= y_1 + 2 = (y_0 + 2) + 2 = y_0 + 2(2) \\
y_3 &= y_2 + 2 = [y_0 + 2(2)] + 2 = y_0 + 3(2)
\end{aligned}$$

and, in general, for any period $t$,

$$y_t = y_0 + t(2) = 15 + 2t \tag{17.4}$$

This last equation indicates the $y$ value of any time period (including the initial period
$t = 0$); it therefore constitutes the solution of (17.2).

The process of iteration is crude (it corresponds roughly to solving simple differential
equations by straight integration), but it serves to point out clearly the manner in which a
time path is generated. In general, the value of $y_t$ will depend in a specified way on the
value of $y$ in the immediately preceding period ($y_{t-1}$); thus a given initial value $y_0$ will
successively lead to $y_1, y_2, \ldots$, via the prescribed pattern of change.
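
The iteration of Example 1 is mechanical enough to hand to a computer. The short Python sketch below (the function name `iterate_example_1` is only an illustrative label) repeats the step $y_{t+1} = y_t + 2$ from $y_0 = 15$ and compares the result with the formula $15 + 2t$.

```python
def iterate_example_1(y0, periods):
    """Generate y_0, ..., y_periods by repeated use of y[t+1] = y[t] + 2."""
    path = [y0]
    for _ in range(periods):
        path.append(path[-1] + 2)
    return path


path = iterate_example_1(15, 10)
closed_form = [15 + 2 * t for t in range(11)]

# The iterated values and the formula (17.4) agree period by period.
print(path == closed_form)
```

Agreement over a finite number of periods is of course only a check, not a proof; the general formula is inferred from the pattern of the iteration, as in the text.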

Example 2

Solve the difference equation (17.3); this time, let the initial value be unspecified and
denoted simply by $y_0$. Again it is more convenient to work with the alternative version in
(17.3''), namely, $y_{t+1} = 0.9y_t$. By iteration, we have

$$\begin{aligned}
y_1 &= 0.9y_0 \\
y_2 &= 0.9y_1 = 0.9(0.9y_0) = (0.9)^2 y_0 \\
y_3 &= 0.9y_2 = 0.9(0.9)^2 y_0 = (0.9)^3 y_0
\end{aligned}$$

These can be summarized into the solution

$$y_t = (0.9)^t y_0 \tag{17.5}$$

To heighten interest, we can lend some economic content to this example. In the simple
multiplier analysis, a single investment expenditure in period 0 will call forth successive
rounds of spending, which in turn will bring about varying amounts of income increment
in succeeding time periods. Using $y$ to denote income increment, we have $y_0$ = the amount
of investment in period 0; but the subsequent income increments will depend on the
marginal propensity to consume (MPC). If MPC = 0.9 and if the income of each period
is consumed only in the next period, then 90 percent of $y_0$ will be consumed in period 1,
resulting in an income increment in period 1 of $y_1 = 0.9y_0$. By similar reasoning, we can
find $y_2 = 0.9y_1$, etc. These, we see, are precisely the results of the iterative process cited
previously. In other words, the multiplier process of income generation can be described by
a difference equation such as (17.3''), and a solution like (17.5) will tell us what the magnitude
of income increment is to be in any time period $t$.
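
To make the multiplier reading concrete, here is a small Python sketch of the successive income increments. The initial investment of 100 is a purely hypothetical figure; exact fractions are used so that the comparison with $(0.9)^t y_0$ is not blurred by rounding.

```python
from fractions import Fraction

MPC = Fraction(9, 10)   # marginal propensity to consume, 0.9 as in the example
y0 = Fraction(100)      # hypothetical investment outlay in period 0

# Each period's income increment is MPC times the previous period's increment.
increments = [y0]
for _ in range(5):
    increments.append(MPC * increments[-1])

# Formula (17.5): y_t = (0.9)^t * y_0.
closed_form = [MPC**t * y0 for t in range(6)]

print([float(y) for y in increments])
print(increments == closed_form)
```

Under these assumed numbers the increments shrink by 10 percent each round, which is the damped pattern that the base $0.9$ in (17.5) implies.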

Example 3

Solve the homogeneous difference equation

$$my_{t+1} - ny_t = 0$$

Upon normalizing and transposing, this may be written as

$$y_{t+1} = \left(\frac{n}{m}\right) y_t$$

which is the same as (17.3'') in Example 2 except for the replacement of 0.9 by $n/m$. Hence,
by analogy, the solution should be

$$y_t = \left(\frac{n}{m}\right)^t y_0$$

Watch the term $(n/m)^t$. It is through this term that various values of $t$ will lead to their
corresponding values of $y$. It therefore corresponds to the expression $e^{rt}$ in the solutions to
differential equations. If we write it more generally as $b^t$ ($b$ for base) and attach the more
general multiplicative constant $A$ (instead of $y_0$), we see that the solution of the general
homogeneous difference equation of Example 3 will be in the form

$$y_t = Ab^t$$

We shall find that this expression $Ab^t$ will play the same important role in difference equations
as the expression $Ae^{rt}$ did in differential equations.† However, even though both are
exponential expressions, the former is to the base $b$, whereas the latter is to the base $e$. It
stands to reason that, just as the type of the continuous-time path $y(t)$ depends heavily on
the value of $r$, the discrete-time path $y_t$ hinges principally on the value of $b$.


General Method 

By this time, you must have become quite impressed with the various similarities between
differential and difference equations. As might be conjectured, the general method of solution
presently to be explained will parallel that for differential equations.

Suppose that we are seeking the solution to the first-order difference equation

$$y_{t+1} + ay_t = c \tag{17.6}$$

where $a$ and $c$ are two constants. The general solution will consist of the sum of two components:
a particular solution $y_p$, which is any solution of the complete nonhomogeneous
equation (17.6), and a complementary function $y_c$, which is the general solution of the
reduced equation of (17.6):

$$y_{t+1} + ay_t = 0 \tag{17.7}$$

The $y_p$ component again represents the intertemporal equilibrium level of $y$, and the $y_c$
component, the deviations of the time path from that equilibrium. The sum of $y_c$ and $y_p$
constitutes the general solution, because of the presence of an arbitrary constant. As before,
in order to definitize the solution, an initial condition is needed.

Let us first deal with the complementary function. Our experience with Example 3
suggests that we may try a solution of the form $y_t = Ab^t$ (with $Ab^t \neq 0$, for otherwise $y_t$
will turn out simply to be a horizontal straight line lying on the $t$ axis); in that case, we also
have $y_{t+1} = Ab^{t+1}$. If these values of $y_t$ and $y_{t+1}$ hold, the homogeneous equation (17.7)
will become

$$Ab^{t+1} + aAb^t = 0$$

which, upon canceling the nonzero common factor $Ab^t$, yields

$$b + a = 0 \qquad\text{or}\qquad b = -a$$

This means that, for the trial solution to work, we must set $b = -a$; then the complementary
function should be written as

$$y_c\,(= Ab^t) = A(-a)^t$$

† You may object to this statement by pointing out that the solution (17.4) in Example 1 does not
contain a term in the form of $Ab^t$. This latter fact, however, arises only because in Example 1 we have
$b = n/m = 1/1 = 1$, so that the term $Ab^t$ reduces to a constant.

Now let us search for the particular solution, which has to do with the complete equation
(17.6). In this regard, Example 3 is of no help at all, because that example relates only
to a homogeneous equation. However, we note that for $y_p$ we can choose any solution of
(17.6); thus if a trial solution of the simplest form $y_t = k$ (a constant) can work out, no real
difficulty will be encountered. Now, if $y_t = k$, then $y$ will maintain the same constant value
over time, and we must have $y_{t+1} = k$ also. Substitution of these values into (17.6) yields

$$k + ak = c \qquad\text{and}\qquad k = \frac{c}{1 + a}$$

Since this particular $k$ value satisfies the equation, the particular integral can be written as

$$y_p\,(= k) = \frac{c}{1 + a} \qquad (a \neq -1)$$

This being a constant, a stationary equilibrium is indicated in this case.

If it happens that $a = -1$, as in Example 1, however, the particular solution $c/(1 + a)$ is
not defined, and some other solution of the nonhomogeneous equation (17.6) must be
sought. In this event, we employ the now-familiar trick of trying a solution of the form
$y_t = kt$. This implies, of course, that $y_{t+1} = k(t + 1)$. Substituting these into (17.6), we find

$$k(t + 1) + akt = c \qquad\text{and}\qquad k = \frac{c}{t + 1 + at} = c \qquad [\text{because } a = -1]$$

thus

$$y_p\,(= kt) = ct$$

This form of the particular solution is a nonconstant function of $t$; it therefore represents a
moving equilibrium.

Adding $y_c$ and $y_p$ together, we may now write the general solution in one of the two
following forms:

$$y_t = A(-a)^t + \frac{c}{1 + a} \qquad [\text{general solution, case of } a \neq -1] \tag{17.8}$$

$$y_t = A(-a)^t + ct = A + ct \qquad [\text{general solution, case of } a = -1] \tag{17.9}$$

Neither of these is completely determinate, in view of the arbitrary constant $A$. To eliminate
this arbitrary constant, we resort to the initial condition that $y_t = y_0$ when $t = 0$. Letting
$t = 0$ in (17.8), we have

$$y_0 = A + \frac{c}{1 + a} \qquad\text{and}\qquad A = y_0 - \frac{c}{1 + a}$$

Consequently, the definite version of (17.8) is

$$y_t = \left(y_0 - \frac{c}{1 + a}\right)(-a)^t + \frac{c}{1 + a} \qquad [\text{definite solution, case of } a \neq -1] \tag{17.8'}$$

Letting $t = 0$ in (17.9), on the other hand, we find $y_0 = A$, so the definite version of
(17.9) is

$$y_t = y_0 + ct \qquad [\text{definite solution, case of } a = -1] \tag{17.9'}$$

If this last result is applied to Example 1, the solution that emerges is exactly the same as
the iterative solution (17.4).

You can check the validity of each of these solutions by the following two steps. First, by
letting $t = 0$ in (17.8'), see that the latter equation reduces to the identity $y_0 = y_0$, signifying
the satisfaction of the initial condition. Second, by substituting the $y_t$ formula (17.8')
and a similar $y_{t+1}$ formula (obtained by replacing $t$ with $(t + 1)$ in (17.8')) into (17.6), see
that the latter reduces to the identity $c = c$, signifying that the time path is consistent with
the given difference equation. The check on the validity of solution (17.9') is analogous.

Example 4

Solve the first-order difference equation

$$y_{t+1} - 5y_t = 1 \qquad \left(y_0 = \tfrac{7}{4}\right)$$

Following the procedure used in deriving (17.8'), we can find $y_c$ by trying a solution
$y_t = Ab^t$ (which implies $y_{t+1} = Ab^{t+1}$). Substituting these values into the homogeneous
version $y_{t+1} - 5y_t = 0$ and canceling the common factor $Ab^t$, we get $b = 5$. Thus

$$y_c = A(5)^t$$

To find $y_p$, try the solution $y_t = k$, which implies $y_{t+1} = k$. Substituting these into the
complete difference equation, we find $k = -\tfrac{1}{4}$. Hence

$$y_p = -\tfrac{1}{4}$$

It follows that the general solution is

$$y_t = y_c + y_p = A(5)^t - \tfrac{1}{4}$$

Letting $t = 0$ here and utilizing the initial condition $y_0 = \tfrac{7}{4}$, we obtain $A = 2$. Thus the
definite solution may finally be written as

$$y_t = 2(5)^t - \tfrac{1}{4}$$

Since the given difference equation of this example is a special case of (17.6), with
$a = -5$, $c = 1$, and $y_0 = \tfrac{7}{4}$, and since (17.8') is the solution “formula” for this type of
difference equation, we could have found our solution by inserting the specific parameter
values into (17.8'), with the result that

$$y_t = \left(\frac{7}{4} - \frac{1}{1 - 5}\right)(5)^t + \frac{1}{1 - 5} = 2(5)^t - \frac{1}{4}$$

which checks perfectly with the earlier answer.

Note that the $y_{t+1}$ term in (17.6) has a unit coefficient. If a given difference equation
has a nonunit coefficient for this term, it must be normalized before using the solution
formula (17.8').
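
The two definite-solution formulas, together with the normalization step just mentioned, can be gathered into one small routine. The Python sketch below is only an illustration under stated assumptions: the function names `definite_solution` and `iterate` and the parameter `lead` (the coefficient of $y_{t+1}$ before normalization) are introduced here and do not appear in the text.

```python
from fractions import Fraction


def definite_solution(a, c, y0, lead=1):
    """Return a function t -> y_t solving lead*y[t+1] + a*y[t] = c with y[0] = y0.

    `lead` is the coefficient of y[t+1]; the equation is first normalized
    so that this coefficient is 1, as the form (17.6) requires.
    """
    a, c, y0 = Fraction(a) / lead, Fraction(c) / lead, Fraction(y0)
    if a == -1:
        return lambda t: y0 + c * t                  # formula (17.9')
    y_p = c / (1 + a)                                # particular solution c / (1 + a)
    return lambda t: (y0 - y_p) * (-a) ** t + y_p    # formula (17.8')


def iterate(a, c, y0, periods, lead=1):
    """Generate y_0, ..., y_periods directly from lead*y[t+1] = c - a*y[t]."""
    path = [Fraction(y0)]
    for _ in range(periods):
        path.append((Fraction(c) - Fraction(a) * path[-1]) / lead)
    return path


# Example 1 in the form (17.6): a = -1, c = 2, y0 = 15.
example_1 = definite_solution(-1, 2, 15)
print([int(example_1(t)) for t in range(4)])

# A hypothetical equation with a nonunit leading coefficient: 4y[t+1] + 2y[t] = 12, y0 = 5.
hypothetical = definite_solution(2, 12, 5, lead=4)
print([hypothetical(t) for t in range(4)] == iterate(2, 12, 5, 3, lead=4))
```

Comparing the formula with brute-force iteration mirrors the two-step validity check described earlier: the initial condition holds at $t = 0$, and each later value satisfies the difference equation.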
EXERCISE 17.2

1. Convert the following difference equations into the form of (17.2''):

(a) $\Delta y_t = 7$

(b) $\Delta y_t = 0.3y_t$

(c) $\Delta y_t = 2y_t - 9$

2. Solve the following difference equations by iteration:

(a) $y_{t+1} = y_t + 1$    ($y_0 = 10$)

(b) $y_{t+1} = \alpha y_t$    ($y_0 = \beta$)

(c) $y_{t+1} = \alpha y_t - \beta$    ($y_t = y_0$ when $t = 0$)

3. Rewrite the equations in Prob. 2 in the form of (17.6), and solve by applying formula
(17.8') or (17.9'), whichever is appropriate. Do your answers check with those
obtained by the iterative method?

4. For each of the following difference equations, use the procedure illustrated in the
derivation of (17.8') and (17.9') to find $y_c$, $y_p$, and the definite solution:

(a) $y_{t+1} + 3y_t = 4$    ($y_0 = 4$)

(b) $2y_{t+1} - y_t = 6$    ($y_0 = 7$)

(c) $y_{t+1} = 0.2y_t + 4$    ($y_0 = 4$)


17.3 The Dynamic Stability of Equilibrium

In the continuous-time case, the dynamic stability of equilibrium depends on the $Ae^{rt}$ term
in the complementary function. In period analysis, the corresponding role is played by the
$Ab^t$ term in the complementary function. Since its interpretation is somewhat more complicated
than $Ae^{rt}$, let us try to clarify it before proceeding further.

The Significance of b

Whether the equilibrium is dynamically stable is a question of whether or not the complementary
function will tend to zero as $t \to \infty$. Basically, we must analyze the path of the
term $Ab^t$ as $t$ is increased indefinitely. Obviously, the value of $b$ (the base of this exponential
term) is of crucial importance in this regard. Let us first consider its significance alone,
by disregarding the coefficient $A$ (by assuming $A = 1$).

For analytical purposes, we can divide the range of possible values of $b$, $(-\infty, +\infty)$,
into seven distinct regions, as set forth in the first two columns of Table 17.1, arranged in
descending order of magnitude of $b$. These regions are also marked off in Fig. 17.1 on a
vertical $b$ scale, with the points $+1$, $0$, and $-1$ as the demarcation points. In fact, these latter
three points in themselves constitute the regions II, IV, and VI. Regions III and V, on the
other hand, correspond to the set of all positive fractions and the set of all negative fractions,
respectively. The remaining two regions, I and VII, are where the numerical value of
$b$ exceeds unity.

In each region, the exponential expression $b^t$ generates a different type of time path.
These are exemplified in Table 17.1 and illustrated in Fig. 17.1. In region I (where $b > 1$),
$b^t$ must increase with $t$ at an increasing pace. The general configuration of the time path
will therefore assume the shape of the top graph in Fig. 17.1. Note that this graph is shown
as a step function rather than as a smooth curve; this is because we are dealing with period
analysis. In region II ($b = 1$), $b^t$ will remain at unity for all values of $t$. Its graph will thus
be a horizontal straight line. Next, in region III, $b^t$ represents a positive fraction raised to
integer powers. As the power is increased, $b^t$ must decrease, though it will always remain
positive. The next case, that of $b = 0$ in region IV, is quite similar to the case of $b = 1$; but
here we have $b^t = 0$ rather than $b^t = 1$, so its graph will coincide with the horizontal axis.
However, this case is of peripheral interest only, since we have earlier adopted the assumption
that $Ab^t \neq 0$.

TABLE 17.1  A Classification of the Values of b

                                                       Value of b^t in Different Time Periods
Region   Value of b                 Value of b^t       t = 0    t = 1    t = 2    t = 3    t = 4
I        b > 1        (|b| > 1)     e.g., (2)^t          1        2        4        8       16
II       b = 1        (|b| = 1)     (1)^t                1        1        1        1        1
III      0 < b < 1    (|b| < 1)     e.g., (1/2)^t        1       1/2      1/4      1/8     1/16
IV       b = 0        (|b| = 0)     (0)^t                0        0        0        0        0
V        -1 < b < 0   (|b| < 1)     e.g., (-1/2)^t       1      -1/2      1/4     -1/8     1/16
VI       b = -1       (|b| = 1)     (-1)^t               1       -1        1       -1        1
VII      b < -1       (|b| > 1)     e.g., (-2)^t         1       -2        4       -8       16

When we move into the negative regions, an interesting new phenomenon occurs: The
value of $b^t$ will alternate between positive and negative values from period to period! This
fact is clearly brought out in the last three rows of Table 17.1 and in the last three graphs of
Fig. 17.1. In region V, where $b$ is a negative fraction, the alternating time path tends to get
closer and closer to the horizontal axis (cf. the positive-fraction region, III). In contrast,
when $b = -1$ (region VI), a perpetual alternation between $+1$ and $-1$ results. And finally,
when $b < -1$ (region VII), the alternating time path will deviate farther and farther from
the horizontal axis.

What is striking is that, whereas the phenomenon of a fluctuating time path cannot possibly
arise from a single $Ae^{rt}$ term (the complex-root case of the second-order differential
equation requires a pair of complex roots), fluctuation can be generated by a single $b^t$
(or $Ab^t$) term. Note, however, that the character of the fluctuation is somewhat different;
unlike the circular-function pattern, the fluctuation depicted in Fig. 17.1 is nonsmooth.
For this reason, we shall employ the word oscillation to denote the new, nonsmooth type
of fluctuation, even though many writers do use the terms fluctuation and oscillation
interchangeably.

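The essence of the preceding discussion can be conveyed in the following general statement:
The time path of $b^t$ ($b \neq 0$) will be

$$\left.\begin{matrix}\text{Nonoscillatory} \\ \text{Oscillatory}\end{matrix}\right\} \quad\text{if}\quad \left\{\begin{matrix} b > 0 \\ b < 0 \end{matrix}\right.$$

$$\left.\begin{matrix}\text{Divergent} \\ \text{Convergent}\end{matrix}\right\} \quad\text{if}\quad \left\{\begin{matrix} |b| > 1 \\ |b| < 1 \end{matrix}\right.$$

It is important to note that, whereas the convergence of the expression $e^{rt}$ depends on the sign
of $r$, the convergence of the $b^t$ expression hinges, instead, on the absolute value of $b$.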
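
The two-way classification can be written out as a short Python sketch. The function name `classify_b` and the wording of the returned labels are illustrative choices; the borderline case $|b| = 1$, which the statement above leaves aside, is reported separately.

```python
def classify_b(b):
    """Describe the time path of b**t for a nonzero base b."""
    if b == 0:
        raise ValueError("b must be nonzero")
    # The sign of b decides whether the path alternates in sign.
    oscillation = "nonoscillatory" if b > 0 else "oscillatory"
    # The absolute value of b decides whether the path moves toward zero.
    if abs(b) > 1:
        magnitude = "divergent"
    elif abs(b) < 1:
        magnitude = "convergent"
    else:
        magnitude = "constant in absolute value"
    return oscillation, magnitude


# One illustrative base from each of regions I, III, V, VI, and VII of Table 17.1.
for b in (2, 0.5, -0.5, -1, -2):
    print(b, classify_b(b))
```

Keeping the two tests separate reflects the point of the statement: sign and absolute value answer different questions about the path.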
FIGURE 17.1  [Figure omitted; columns: Value of b, Region, Configuration of b^t]



The Role of A 

So far we have deliberately left out the multiplicative constant $A$. But its effects, of which
there are two, are relatively easy to take into account. First, the magnitude of $A$ can serve
to “blow up” (if, say, $A = 3$) or “pare down” (if, say, $A = \tfrac{1}{5}$) the values of $b^t$. That is, it can
produce a scale effect without changing the basic configuration of the time path. The sign of
$A$, on the other hand, does materially affect the shape of the path because, if $b^t$ is multiplied
by $A = -1$, then each time path shown in Fig. 17.1 will be replaced by its own mirror
image with reference to the horizontal axis. Thus, a negative $A$ can produce a mirror effect
as well as a scale effect.


Convergence to Equilibrium 

The preceding discussion presents the interpretation of the $Ab^t$ term in the complementary
function, which, as we recall, represents the deviations from some intertemporal equilibrium
level. If a term (say) $y_p = 5$ is added to the $Ab^t$ term, the time path must be shifted up
vertically by a constant value of 5. This will in no way affect the convergence or divergence
of the time path, but it will alter the level with reference to which convergence or divergence
is gauged. What Fig. 17.1 pictures is the convergence (or lack of it) of the $Ab^t$
expression to zero. When the $y_p$ is included, it becomes a question of the convergence of
the time path $y_t = y_c + y_p$ to the equilibrium level $y_p$.

In this connection, let us add a word of explanation for the special case of $b = 1$ (region II).
A time path such as

$$y_t = A(1)^t + y_p = A + y_p$$

gives the impression that it converges, because the multiplicative term $(1)^t = 1$ produces
no explosive effect. Observe, however, that $y_t$ will now take the value $(A + y_p)$ rather than
the equilibrium value $y_p$; in fact, it can never reach $y_p$ (unless $A = 0$). As an illustration of
this type of situation, we can cite the time path in (17.9), in which a moving equilibrium
$y_p = ct$ is involved. This time path is to be considered divergent, not because of the
appearance of $t$ in the particular solution but because, with a nonzero $A$, there will be a constant
deviation from the moving equilibrium. Thus, in stipulating the condition for convergence
of time path $y_t$ to the equilibrium $y_p$, we must rule out the case of $b = 1$.

In sum, the solution

$$y_t = Ab^t + y_p$$

is a convergent path if and only if $|b| < 1$.
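
The convergence condition, including the special status of $b = 1$, can be seen numerically by tracking the distance between $y_t$ and $y_p$. In the Python sketch below the values $A = 3$ and the three bases are hypothetical; only $y_p = 5$ echoes the illustration above.

```python
from fractions import Fraction


def time_path(A, b, y_p, periods):
    """Return y_t = A * b**t + y_p for t = 0, ..., periods."""
    return [A * b**t + y_p for t in range(periods + 1)]


def deviations(path, y_p):
    """Absolute distance of each y_t from the equilibrium level y_p."""
    return [abs(y - y_p) for y in path]


y_p = 5
damped = deviations(time_path(3, Fraction(-1, 2), y_p, 6), y_p)   # |b| < 1
constant = deviations(time_path(3, 1, y_p, 6), y_p)               # b = 1
explosive = deviations(time_path(3, -2, y_p, 6), y_p)             # |b| > 1

print([float(d) for d in damped])   # shrinks toward zero
print(constant)                     # stays at |A| in every period
print(explosive)                    # grows without bound
```

With $b = 1$ the deviation neither grows nor shrinks, which is exactly why that case has to be excluded from the convergent ones.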


Example 1 


What kind of time path is represented by $y_t = 2\left(-\tfrac{4}{5}\right)^t + 9$? Since $b = -\tfrac{4}{5} < 0$, the time path
is oscillatory. But since $|b| = \tfrac{4}{5} < 1$, the oscillation is damped, and the time path converges
to the equilibrium level of 9.


You should exercise care not to confuse $2\left(-\tfrac{4}{5}\right)^t$ with $-2\left(\tfrac{4}{5}\right)^t$; they represent entirely
different time-path configurations.


Example 2 


How do you characterize the time path $y_t = 3(2)^t + 4$? Since $b = 2 > 0$, no oscillation will
occur. But since $|b| = 2 > 1$, the time path will diverge from the equilibrium level of 4.


EXERCISE 17.3

1. Discuss the nature of the following time paths:

(a) $y_t = 3^t + 1$    (c) $y_t = 5\left(-\tfrac{1}{10}\right)^t + 3$

(b) $y_t = 2\left(\tfrac{1}{3}\right)^t$    (d) $y_t = -3\left(\tfrac{1}{4}\right)^t + 2$



2. What is the nature of the time path obtained from each of the difference equations in
Exercise 17.2-4?

3. Find the solutions of the following, and determine whether the time paths are oscillatory
and convergent:

(a) $y_{t+1} - \tfrac{1}{3}y_t = 6$    ($y_0 = 1$)

(b) $y_{t+1} + 2y_t = 9$    ($y_0 = 4$)

(c) $y_{t+1} + \tfrac{1}{4}y_t = 5$    ($y_0 = 2$)

(d) $y_{t+1} - y_t = 3$    ($y_0 = 5$)

17.4 The Cobweb Model 


To illustrate the use of first-order difference equations in economic analysis, we shall cite
two variants of the market model for a single commodity. The first variant, known as the
cobweb model, differs from our earlier market models in that it treats $Q_s$ as a function not
of the current price but of the price of the preceding time period.

The Model 

Consider a situation in which the producer's output decision must be made one period in
advance of the actual sale, such as in agricultural production, where planting must precede
by an appreciable length of time the harvesting and sale of the output. Let us assume
that the output decision in period $t$ is based on the then-prevailing price $P_t$. Since this
output will not be available for the sale until period $(t + 1)$, however, $P_t$ will determine
not $Q_{st}$, but $Q_{s,t+1}$. Thus we now have a “lagged” supply function:¹

$$Q_{s,t+1} = S(P_t)$$

or, equivalently, by shifting back the time subscripts by one period,

$$Q_{st} = S(P_{t-1})$$

When such a supply function interacts with a demand function of the form

$$Q_{dt} = D(P_t)$$

interesting dynamic price patterns will result.

Taking the linear versions of these (lagged) supply and (unlagged) demand functions,
and assuming that in each time period the market price is always set at a level which clears
the market, we have a market model with the following three equations:

$$\begin{aligned}
Q_{dt} &= Q_{st} \\
Q_{dt} &= \alpha - \beta P_t \qquad (\alpha, \beta > 0) \\
Q_{st} &= -\gamma + \delta P_{t-1} \qquad (\gamma, \delta > 0)
\end{aligned} \tag{17.10}$$

1 We are making the implicit assumption here that the entire output of a period will be placed on the 
market, with no part of it held in storage. Such an assumption is appropriate when the commodity in 
question is perishable or when no inventory is ever kept. A model with inventory will be considered 
in Sec. 17.5. 
By substituting the last two equations into the first, however, the model can be reduced to a
single first-order difference equation as follows:

$$\beta P_t + \delta P_{t-1} = \alpha + \gamma$$

In order to solve this equation, it is desirable first to normalize it and shift the time subscripts
ahead by one period [alter $t$ to $(t + 1)$, etc.]. The result,

$$P_{t+1} + \frac{\delta}{\beta} P_t = \frac{\alpha + \gamma}{\beta} \tag{17.11}$$

will then be a replica of (17.6), with the substitutions

$$y = P \qquad a = \frac{\delta}{\beta} \qquad\text{and}\qquad c = \frac{\alpha + \gamma}{\beta}$$

Inasmuch as $\delta$ and $\beta$ are both positive, it follows that $a \neq -1$. Consequently, we can apply
formula (17.8'), to get the time path

$$P_t = \left(P_0 - \frac{\alpha + \gamma}{\beta + \delta}\right)\left(-\frac{\delta}{\beta}\right)^t + \frac{\alpha + \gamma}{\beta + \delta} \tag{17.12}$$

where $P_0$ represents the initial price.
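
As a check on (17.12), the model can be simulated period by period and compared with the formula. The Python sketch below is an illustration only: the parameter values $\alpha = 10$, $\beta = 2$, $\gamma = 1$, $\delta = 1$, $P_0 = 5$ are hypothetical, and the function names are introduced here for convenience.

```python
from fractions import Fraction


def cobweb_prices(alpha, beta, gamma, delta, p0, periods):
    """Simulate (17.10): lagged supply, then the price that clears the market."""
    prices = [Fraction(p0)]
    for _ in range(periods):
        supply = -gamma + delta * prices[-1]                 # quantity supplied next period
        prices.append((alpha - supply) / Fraction(beta))     # solve alpha - beta*P = supply
    return prices


def cobweb_closed_form(alpha, beta, gamma, delta, p0, t):
    """Evaluate the time path (17.12) at period t."""
    p_bar = Fraction(alpha + gamma, beta + delta)            # (alpha + gamma) / (beta + delta)
    return (p0 - p_bar) * Fraction(-delta, beta) ** t + p_bar


# Hypothetical parameter values, chosen only for illustration.
params = dict(alpha=10, beta=2, gamma=1, delta=1, p0=5)
simulated = cobweb_prices(periods=6, **params)
formula = [cobweb_closed_form(t=t, **params) for t in range(7)]

print([float(p) for p in simulated])
print(simulated == formula)
```

Under these assumed numbers the price alternates around $11/3$ with shrinking swings, as one would expect from a base of $-\delta/\beta = -1/2$.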


The Cobwebs 

Three points may be observed in regard to this time path. In the first place, the expression
$(\alpha + \gamma)/(\beta + \delta)$, which constitutes the particular integral of the difference equation, can
be taken as the intertemporal equilibrium price of the model:†

$$\bar{P} = \frac{\alpha + \gamma}{\beta + \delta}$$

Because this is a constant, it is a stationary equilibrium. Substituting $\bar{P}$ into our solution,
we can express the time path $P_t$ alternatively in the form

$$P_t = (P_0 - \bar{P})\left(-\frac{\delta}{\beta}\right)^t + \bar{P} \tag{17.12'}$$

This leads us to the second point, namely, the significance of the expression $(P_0 - \bar{P})$.
Since this corresponds to the constant $A$ in the $Ab^t$ term, its sign will bear on the question
of whether the time path will commence above or below the equilibrium (mirror effect),
whereas its magnitude will decide how far above or below (scale effect). Lastly, there is the
expression $(-\delta/\beta)$, which corresponds to the $b$ component of $Ab^t$. From our model specification
that $\beta, \delta > 0$, we can deduce an oscillatory time path. It is this fact which gives
rise to the cobweb phenomenon, as we shall presently see. There can, of course, arise three
possible varieties of oscillation patterns in the model. According to Table 17.1 or Fig. 17.1,
the oscillation will be

$$\left.\begin{matrix}\text{Explosive} \\ \text{Uniform} \\ \text{Damped}\end{matrix}\right\} \quad\text{if}\quad \delta \;\begin{matrix} > \\ = \\ < \end{matrix}\; \beta$$

where the term uniform oscillation refers to the type of path in region VI.

† As far as the market-clearing sense of equilibrium is concerned, the price reached in each period is
an equilibrium price, because we have assumed that $Q_{dt} = Q_{st}$ for every $t$.

FIGURE 17.2  [Figure omitted; panel (a): $\delta > \beta$, panel (b): $\delta < \beta$]

In order to visualize the cobwebs, let us depict the model (17.10) in Fig. 17.2. The second
equation of (17.10) plots as a downward-sloping linear demand curve, with its slope
numerically equal to $\beta$. Similarly, a linear supply curve with a slope equal to $\delta$ can be drawn
from the third equation, if we let the $Q$ axis represent in this instance a lagged quantity supplied.
The case of $\delta > \beta$ ($S$ steeper than $D$) and the case of $\delta < \beta$ ($S$ flatter than $D$) are
illustrated in Fig. 17.2a and b, respectively. In either case, however, the intersection of $D$
and $S$ will yield the intertemporal equilibrium price $\bar{P}$.

When $\delta > \beta$, as in Fig. 17.2a, the interaction of demand and supply will produce an
explosive oscillation as follows. Given an initial price $P_0$ (here assumed above $\bar{P}$), we can
follow the arrowhead and read off on the $S$ curve that the quantity supplied in the next
period (period 1) will be $Q_1$. In order to clear the market, the quantity demanded in period
1 must also be $Q_1$, which is possible if and only if price is set at the level of $P_1$ (see downward
arrow). Now, via the $S$ curve, the price $P_1$ will lead to $Q_2$ as the quantity supplied in
period 2, and to clear the market in the latter period, price must be set at the level of $P_2$
according to the demand curve. Repeating this reasoning, we can trace out the prices and
quantities in subsequent periods by simply following the arrowheads in the diagram,
thereby spinning a “cobweb” around the demand and supply curves. By comparing the
price levels, $P_0, P_1, P_2, \ldots$, we observe in this case not only an oscillatory pattern of
change but also a tendency for price to widen its deviation from $\bar{P}$ as time goes by.
With the cobweb being spun from inside out, the time path is divergent and the oscillation
explosive.
By way of contrast, in the case of Fig. 17.2b, where $\delta < \beta$, the spinning process will
create a cobweb which is centripetal. From $P_0$, if we follow the arrowheads, we shall be
led ever closer to the intersection of the demand and supply curves, where $\bar{P}$ is. While still
oscillatory, this price path is convergent.

In Fig. 17.2 we have not shown a third possibility, namely, that of $\delta = \beta$. The procedure
of graphical analysis involved, however, is perfectly analogous to the other two cases. It is
therefore left to you as an exercise.
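
The three cases reduce to a comparison of the two slope parameters, since the base of the oscillating term is $-\delta/\beta$. A minimal Python sketch of that comparison follows; the function name and the sample parameter pairs are hypothetical.

```python
def cobweb_oscillation(beta, delta):
    """Classify the cobweb price path from the slope parameters of (17.10)."""
    if beta <= 0 or delta <= 0:
        raise ValueError("the model assumes beta > 0 and delta > 0")
    if delta > beta:
        return "explosive"   # |-delta/beta| > 1
    if delta == beta:
        return "uniform"     # |-delta/beta| = 1
    return "damped"          # |-delta/beta| < 1


# Hypothetical (beta, delta) pairs, one for each case.
for beta, delta in [(2, 3), (2, 2), (3, 2)]:
    print(beta, delta, cobweb_oscillation(beta, delta))
```

Note that the intercept parameters $\alpha$ and $\gamma$ play no part here; they affect the level of $\bar{P}$ but not the stability of the path.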

The preceding discussion has dealt only with the time path of $P$ (that is, $P_t$); after $P_t$ is
found, however, it takes but a short step to get to the time path of $Q$. The second equation
of (17.10) relates $Q_{dt}$ to $P_t$, so if (17.12) or (17.12') is substituted into the demand equation,
the time path of $Q_{dt}$ can be obtained immediately. Moreover, since $Q_{dt}$ must be equal
to $Q_{st}$ in each time period (clearance of market), we can simply refer to the time path as $Q_t$
rather than $Q_{dt}$. On the basis of Fig. 17.2, the rationale of this substitution is easily seen.
Each point on the $D$ curve relates a $P_t$ to a $Q_t$ pertaining to the same time period; therefore,
the demand function can serve to map the time path of price into the time path of quantity.
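
The mapping from the price path to the quantity path is a one-line computation. The Python sketch below uses hypothetical parameter values ($\alpha = 10$, $\beta = 2$, $\gamma = 1$, $\delta = 1$, $P_0 = 5$) and confirms that the quantity read off the demand curve in each period equals the lagged supply.

```python
from fractions import Fraction

# Hypothetical parameter values for the model (17.10).
alpha, beta, gamma, delta = 10, 2, 1, 1

# Price path from the market-clearing condition: beta*P[t] + delta*P[t-1] = alpha + gamma.
prices = [Fraction(5)]
for _ in range(5):
    prices.append((alpha + gamma - delta * prices[-1]) / beta)

# Map each price into a quantity through the demand function Q = alpha - beta*P.
quantities = [alpha - beta * p for p in prices]

# From period 1 onward, the same quantities come out of the lagged supply function.
lagged_supply = [-gamma + delta * p for p in prices[:-1]]
print(quantities[1:] == lagged_supply)
```

The period-0 quantity has no lagged-supply counterpart here because the simulation starts from a given $P_0$ rather than from an earlier price.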

You should note that the graphical technique of Fig. 17.2 is applicable even when the $D$
and $S$ curves are nonlinear.


EXERCISE 17.4 

1. On the basis of (17.10), find the time path of $Q_t$, and analyze the condition for its
convergence.

2. Draw a diagram similar to those of Fig. 17.2 to show that, for the case of $\delta = \beta$, the
price will oscillate uniformly with neither damping nor explosion.

3. Given demand and supply for the cobweb model as follows, find the intertemporal 
equilibrium price, and determine whether the equilibrium is stable: 

(a) $Q_{dt} = 18 - 3P_t$, $Q_{st} = -3 + 4P_{t-1}$

(b) $Q_{dt} = 22 - 3P_t$, $Q_{st} = -2 + P_{t-1}$

(c) $Q_{dt} = 19 - 6P_t$, $Q_{st} = 6P_{t-1} - 5$

4. In model (17.10), let the $Q_{dt} = Q_{st}$ condition and the demand function remain as they
are, but change the supply function to

$$Q_{st} = -\gamma + \delta P_t^*$$

where $P_t^*$ denotes the expected price for period t. Furthermore, suppose that sellers
have the “adaptive” type of price expectation:†

$$P_{t+1}^* = P_t^* + \eta\,(P_t - P_t^*) \qquad (0 < \eta \le 1)$$

where $\eta$ (the Greek letter eta) is an expectation-adjustment coefficient.

(a) Give an economic interpretation to the preceding equation. In what respects is it
similar to, and different from, the adaptive expectations equation (16.34)?

(b) What happens if $\eta$ takes its maximum value? Can we consider the cobweb model
as a special case of the present model?


† See Nerlove, “Adaptive Expectations and Cobweb Phenomena,” Quarterly Journal of
Economics, May 1958, pp. 227-240.

(c) Show that the new model can be represented by the first-order difference equation

$$P_{t+1} - \left(1 - \eta - \frac{\eta\delta}{\beta}\right) P_t = \frac{\eta(\alpha + \gamma)}{\beta}$$

(Hint: Solve the supply function for $P_t^*$, and then use the information that
$Q_{st} = Q_{dt} = \alpha - \beta P_t$.)

(d) Find the time path of price. Is this path necessarily oscillatory? Can it be oscillatory?
Under what circumstances?

(e) Show that the time path $P_t$, if oscillatory, will converge only if $1 - 2/\eta < -\delta/\beta$. As
compared with the cobweb solution (17.12) or (17.12'), does the new model have
a wider or narrower range for the stability-inducing values of $-\delta/\beta$?

5. The cobweb model, like the previously encountered dynamic market models, is essentially
based on the static market model presented in Sec. 3.2. What economic assumption
is the dynamizing agent in the present case? Explain.


17.5 A Market Model with Inventory

In the preceding model, price is assumed to be set in such a way as to clear the current output
of every time period. The implication of that assumption is either that the commodity
is a perishable which cannot be stocked or that, though it is stockable, no inventory is ever
kept. Now we shall construct a model in which sellers do keep an inventory of the
commodity.

The Model 

Let us assume the following: 

1. Both the quantity demanded, $Q_{dt}$, and the quantity currently produced, $Q_{st}$, are
unlagged linear functions of price $P_t$.

2. The adjustment of price is effected not through market clearance in every period, but
through a process of price-setting by the sellers: At the beginning of each period, the
sellers set a price for that period after taking into consideration the inventory situation.
If, as a result of the preceding-period price, inventory accumulated, the current-period
price is set at a lower level than before, in order to “move” the merchandise; but if
inventory decumulated instead, the current price is set higher than before.

3. The price adjustment made from period to period is inversely proportional to the
observed change in the inventory (stock).

With these assumptions, we can write the following equations: 

$$
\begin{aligned}
Q_{dt} &= \alpha - \beta P_t && (\alpha, \beta > 0) \\
Q_{st} &= -\gamma + \delta P_t && (\gamma, \delta > 0) \\
P_{t+1} &= P_t - \sigma (Q_{st} - Q_{dt}) && (\sigma > 0)
\end{aligned}
\tag{17.13}
$$

where $\sigma$ denotes the stock-induced-price-adjustment coefficient. Note that (17.13) is really
nothing but the discrete-time counterpart of the market model of Sec. 15.2, although we
have now couched the price-adjustment process in terms of inventory $(Q_{st} - Q_{dt})$ rather
than excess demand $(Q_{dt} - Q_{st})$. Nevertheless, the analytical results will turn out to be
much different; for one thing, with discrete time, we may encounter the phenomenon of
oscillations. Let us derive and analyze the time path $P_t$.


The Time Path 

By substituting the first two equations into the third, the model can be condensed into a 
single difference equation: 

$$P_{t+1} - [1 - \sigma(\beta + \delta)]\,P_t = \sigma(\alpha + \gamma) \tag{17.14}$$

and its solution is given by (17.8'): 


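$$P_t = \left(P_0 - \frac{\alpha + \gamma}{\beta + \delta}\right)[1 - \sigma(\beta + \delta)]^t + \frac{\alpha + \gamma}{\beta + \delta} = (P_0 - \bar{P})[1 - \sigma(\beta + \delta)]^t + \bar{P} \tag{17.15}$$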
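As a check on the condensation of the model into a single difference equation, the three equations of (17.13) can be iterated period by period and compared with the closed-form path $(P_0 - \bar{P})[1 - \sigma(\beta+\delta)]^t + \bar{P}$. The sketch below is only illustrative; the function names and the parameter values are assumptions made for the example.

```python
def inventory_path_iterated(alpha, beta, gamma, delta, sigma, p0, periods):
    """Apply the three model equations directly, one period at a time."""
    prices = [p0]
    for _ in range(periods):
        p = prices[-1]
        q_d = alpha - beta * p           # quantity demanded
        q_s = -gamma + delta * p         # quantity currently produced
        prices.append(p - sigma * (q_s - q_d))  # price falls when inventory accumulates
    return prices


def inventory_path_closed_form(alpha, beta, gamma, delta, sigma, p0, periods):
    """Evaluate (P_0 - P_bar) * b**t + P_bar with b = 1 - sigma*(beta + delta)."""
    p_bar = (alpha + gamma) / (beta + delta)
    b = 1 - sigma * (beta + delta)
    return [(p0 - p_bar) * b ** t + p_bar for t in range(periods + 1)]


# Hypothetical values: P_bar = 6 and b = 0.6, so the path converges without oscillation.
params = dict(alpha=20, beta=1, gamma=4, delta=3, sigma=0.1, p0=8, periods=10)
iterated = inventory_path_iterated(**params)
closed = inventory_path_closed_form(**params)
print(max(abs(x - y) for x, y in zip(iterated, closed)))  # difference is at rounding-error level
```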


Obviously, therefore, the dynamic stability of the model will hinge on the expression
$1 - \sigma(\beta + \delta)$; for convenience, let us refer to this expression as b.

With reference to Table 17.1, we see that, in analyzing the exponential expression $b^t$,
seven distinct regions of b values may be defined. However, since our model specifications
($\sigma, \beta, \delta > 0$) have effectually ruled out the first two regions, there remain only five possible
cases, as listed in Table 17.2. For each of these regions, the b specification of the second
column can be translated into an equivalent $\sigma$ specification, as shown in the third column.
For instance, for region III, the b specification is $0 < b < 1$; therefore, we can write

$$0 < 1 - \sigma(\beta + \delta) < 1$$

$$-1 < -\sigma(\beta + \delta) < 0 \qquad \text{[subtracting 1 from all three parts]}$$

and

$$\frac{1}{\beta + \delta} > \sigma > 0 \qquad \text{[dividing through by } -(\beta + \delta)\text{]}$$


TABLE 17.2  Types of Time Path

Region   Value of $b \equiv 1 - \sigma(\beta+\delta)$   Value of $\sigma$                                             Nature of Time Path $P_t$
III      $0 < b < 1$                                    $0 < \sigma < \frac{1}{\beta+\delta}$                         Nonoscillatory and convergent
IV       $b = 0$                                        $\sigma = \frac{1}{\beta+\delta}$                             Remaining in equilibrium†
V        $-1 < b < 0$                                   $\frac{1}{\beta+\delta} < \sigma < \frac{2}{\beta+\delta}$    With damped oscillation
VI       $b = -1$                                       $\sigma = \frac{2}{\beta+\delta}$                             With uniform oscillation
VII      $b < -1$                                       $\sigma > \frac{2}{\beta+\delta}$                             With explosive oscillation

† The fact that price will be remaining in equilibrium in this case can also be seen directly from (17.14). With $\sigma = 1/(\beta + \delta)$, the
coefficient of $P_t$ becomes zero, and (17.14) reduces to $P_{t+1} = \sigma(\alpha + \gamma) = (\alpha + \gamma)/(\beta + \delta) = \bar{P}$.
This last gives us the desired equivalent $\sigma$ specification for region III. The translation for
the other regions may be carried out analogously. Since the type of time path pertaining to
each region is already known from Fig. 17.1, the $\sigma$ specification enables us to tell from
given values of $\sigma$, $\beta$, and $\delta$ the general nature of the time path $P_t$, as outlined in the last
column of Table 17.2.

Example 1

If sellers in our model always increase (decrease) the price by 10 percent of the amount
of the decrease (increase) in inventory, and if the demand curve has a slope of -1 and the
supply curve a slope of 15 (both slopes with respect to the price axis), what type of time
path $P_t$ will we find?

Here, we have $\sigma = 0.1$, $\beta = 1$, and $\delta = 15$. Since $1/(\beta + \delta) = \frac{1}{16}$ and $2/(\beta + \delta) = \frac{1}{8}$, the
value of $\sigma$ ($= \frac{1}{10}$) lies between the former two values; it is thus a case of region V. The time
path $P_t$ will be characterized by damped oscillation.
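The region test of Table 17.2 is mechanical enough to be written as a short routine. The sketch below assumes $\sigma, \beta, \delta > 0$ and uses exact fractions so that the boundary cases $b = 0$ and $b = -1$ can be detected reliably; the function name `classify_time_path` is hypothetical.

```python
from fractions import Fraction


def classify_time_path(sigma, beta, delta):
    """Return (region, nature of time path) from b = 1 - sigma*(beta + delta).

    Assumes sigma, beta, delta > 0, which rules out regions I and II.
    Pass Fractions or integers for exact boundary comparisons; with floats
    the equality tests for regions IV and VI may fail through rounding.
    """
    if min(sigma, beta, delta) <= 0:
        raise ValueError("sigma, beta and delta must all be positive")
    b = 1 - sigma * (beta + delta)
    if b > 0:
        return "III", "nonoscillatory and convergent"
    if b == 0:
        return "IV", "remaining in equilibrium"
    if b > -1:
        return "V", "with damped oscillation"
    if b == -1:
        return "VI", "with uniform oscillation"
    return "VII", "with explosive oscillation"


# Example 1: sigma = 1/10, beta = 1, delta = 15 gives b = -3/5.
print(classify_time_path(Fraction(1, 10), 1, 15))
```

Here $b = 1 - \frac{16}{10} = -\frac{3}{5}$, which lies between -1 and 0, in agreement with the region V verdict of Example 1.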

Graphical Summary of the Results 

The substance of Table 17.2, which contains as many as five different possible cases of $\sigma$
specification, can be made much easier to grasp if the results are presented graphically.
Inasmuch as the $\sigma$ specification involves essentially a comparison of the relative magnitudes
of the parameters $\sigma$ and $(\beta + \delta)$, let us plot $\sigma$ against $(\beta + \delta)$, as in Fig. 17.3. Note
that we need only concern ourselves with the positive quadrant because, by model specification,
$\sigma$ and $(\beta + \delta)$ are both positive. From Table 17.2, it is clear that regions IV and VI
are specified by the equations $\sigma = 1/(\beta + \delta)$ and $\sigma = 2/(\beta + \delta)$, respectively. Since each
of these plots as a rectangular hyperbola, the two regions are graphically represented by the
two hyperbolic curves in Fig. 17.3. Once we have the two hyperbolas, moreover, the other
three regions immediately fall into place. Region III, for instance, is merely the set of points
lying below the lower hyperbola, where we have $\sigma$ less than $1/(\beta + \delta)$. Similarly, region V
is represented by the set of points falling between the two hyperbolas, whereas all the points
located above the higher hyperbola pertain to region VII.
Example 2


If $\sigma = \beta = 1$, and $\delta = \ldots$, will our model (17.13) yield a convergent time path $P_t$? The
given parametric values correspond to point A in Fig. 17.3. Since it falls within region V, the
time path is convergent, though oscillatory.


You will note that, in the two models just presented, our analytical results are in each
instance stated as a set of alternative possible cases: three types of oscillatory path for
the cobwebs, and five types of time path in the inventory model. This richness of analytical
results stems, of course, from the parametric formulation of the models. The fact that our
result cannot be stated in a single unequivocal answer is, of course, a merit rather than a
weakness.


EXERCISE 17.5 

1. In solving (17.14), why should formula (17.8') be used instead of (17.9')?

2. On the basis of Table 17.2, check the validity of the translation from the b specification
to the $\sigma$ specification for regions IV through VII.

3. If model (17.13) has the following numerical form:

$$
\begin{aligned}
Q_{dt} &= 21 - 2P_t \\
Q_{st} &= -3 + 6P_t \\
P_{t+1} &= P_t - 0.3(Q_{st} - Q_{dt})
\end{aligned}
$$

find the time path $P_t$ and determine whether it is convergent.

4. Suppose that, in model (17.13), the supply in each period is a fixed quantity, say,
$Q_{st} = k$, instead of a function of price. Analyze the behavior of price over time. What
restriction should be imposed on k to make the solution economically meaningful?


17.6 Nonlinear Difference Equations: The Qualitative-Graphic Approach

Thus far we have only utilized linear difference equations in our models; but the facts of
economic life may not always acquiesce to the convenience of linearity. Fortunately, when
nonlinearity occurs in the case of first-order difference-equation models, there exists an
easy method of analysis that is applicable under fairly general conditions. This method,
graphic in nature, closely resembles that of the qualitative analysis of first-order differential
equations presented in Sec. 15.6.

Phase Diagram 

Nonlinear difference equations in which only the variables $y_{t+1}$ and $y_t$ appear, such as

$$y_{t+1} + y_t^3 = 5 \qquad \text{or} \qquad y_{t+1} + \sin y_t - \ln y_t = 3$$

can be categorically represented by the equation

$$y_{t+1} = f(y_t) \tag{17.16}$$

where f can be a function of any degree of complexity, as long as it is a function of $y_t$ alone
without t as another argument. When the two variables $y_{t+1}$ and $y_t$ are plotted against each
other in a Cartesian coordinate plane, the resulting diagram constitutes a phase diagram,
and the curve corresponding to f is a phase line. From these, it is possible to analyze the
time path of the variable by the process of iteration.

The terms phase diagram and phase line are used here in analogy to the differential-equation
case; but note one dissimilarity in the construction of the diagram. In the differential-equation
case, we plotted $dy/dt$ against y as in Fig. 15.3, so that, in order to be perfectly
analogous in the present case, we should have $\Delta y_t$ on the vertical axis and $y_t$ on the horizontal.
This is not impossible to do, but it is much more convenient to place $y_{t+1}$ on the vertical
axis instead, as we have done in Fig. 17.4 where the same scale is used on both axes.
Note the presence of a 45° line in each diagram of Fig. 17.4; this line will prove to be of
great service in carrying out our graphic analysis.

Let us illustrate the procedure involved by means of Fig. 17.4a, where we have drawn a
phase line (labeled $f_1$) representing a specific difference equation $y_{t+1} = f_1(y_t)$. If we are
given an initial value $y_0$ (plotted on the horizontal axis), by iteration we can trace out all the
subsequent values of y as follows. First, since the phase line $f_1$ maps the initial value $y_0$
into $y_1$ according to the equation

$$y_1 = f_1(y_0)$$

we can go straight up from $y_0$ to the phase line, hit point A, and read its height on the vertical
axis as the value of $y_1$. Next, we seek to map $y_1$ into $y_2$ according to the equation

$$y_2 = f_1(y_1)$$

For this purpose, we must first plot $y_1$ on the horizontal axis, similarly to $y_0$ during the
first mapping. This required transplotting of $y_1$ from the vertical axis to the horizontal is
most easily accomplished by the use of the 45° line, which, having a slope of +1, is the
locus of points with identical abscissa and ordinate, such as (2, 2) and (5, 5). Thus, to
transplot $y_1$ from the vertical axis, we can simply go across to the 45° line, hit point B, and
then turn straight down to the horizontal axis to locate the point $y_1$. By repeating this
process, we can map $y_1$ to $y_2$ via point C on the phase line, and then use the 45° line for
transplotting $y_2$, etc.

[FIGURE 17.4, panels (a) to (d): phase lines plotted with $y_t$ on the horizontal axis, $y_{t+1}$ on the vertical axis, and a 45° line]

Now that the nature of the iteration is clear, we may observe that the desired iteration can
be achieved simply by following the arrowheads from $y_0$ to A (on the phase line), to B (on
the 45° line), to C (on the phase line), etc., always alternating between the two lines,
without it ever being necessary to resort to the axes again.
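The arrowhead-following procedure translates directly into a loop. The sketch below is an illustration under stated assumptions: the phase line $f(y) = 2\sqrt{y}$ is a hypothetical stand-in (it is not the $f_1$ of Fig. 17.4a), chosen because it crosses the 45° line at $y = 4$ with a positive slope less than 1; the function names are likewise invented for the example.

```python
import math


def iterate_phase_line(f, y0, steps):
    """Return [y_0, y_1, ..., y_steps], where y_{t+1} = f(y_t)."""
    path = [y0]
    for _ in range(steps):
        path.append(f(path[-1]))
    return path


def arrowhead_points(f, y0, steps):
    """Vertices hit by the arrowheads: phase line, 45-degree line, phase line, ..."""
    points = []
    y = y0
    for _ in range(steps):
        y_next = f(y)
        points.append((y, y_next))       # point on the phase line (A, C, ...)
        points.append((y_next, y_next))  # transplotting point on the 45-degree line (B, ...)
        y = y_next
    return points


def hypothetical_phase_line(y):
    """Assumed phase line for illustration only; its equilibrium is at y = 4."""
    return 2 * math.sqrt(y)


path = iterate_phase_line(hypothetical_phase_line, 1.0, 10)
print(path)
print(arrowhead_points(hypothetical_phase_line, 1.0, 2))
```

Starting from $y_0 = 1$, the values rise steadily toward 4 without overshooting, which is the kind of path the graphic iteration traces out for a positively sloped phase line that is flatter than the 45° line.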

Types of Time Path 

The graphic iterations just outlined are, of course, equally applicable to the other three
diagrams in Fig. 17.4. Actually, these four diagrams serve to illustrate four basic varieties
of phase lines, each implying a different type of time path. The first two phase lines, $f_1$ and
$f_2$, are characterized by positive slopes, with one slope being less than unity and the other
one greater than unity:

$$0 < f_1'(y_t) < 1 \qquad \text{and} \qquad f_2'(y_t) > 1$$

The remaining two, on the other hand, are negatively sloped; specifically, we have

$$-1 < f_3'(y_t) < 0 \qquad \text{and} \qquad f_4'(y_t) < -1$$

In each diagram of Fig. 17.4, the intertemporal equilibrium value of y (namely $\bar{y}$) is
located at the intersection of the phase line and the 45° line, which we have labeled E. This
is so because the point E on the phase line, being simultaneously a point on the 45° line,
will map a $y_t$ into a $y_{t+1}$ of identical value; and when $y_{t+1} = y_t$, by definition y must be in
equilibrium intertemporally. Our principal task is to determine whether, given an initial
value $y_0 \neq \bar{y}$, the pattern of change implied by the phase line will lead us consistently
toward $\bar{y}$ (convergent) or away from it (divergent).

For the phase line $f_1$, the iterative process leads from $y_0$ to $\bar{y}$ in a steady path, without
oscillation. You can verify that, if $y_0$ is placed to the right of $\bar{y}$, there will also be a steady
movement toward $\bar{y}$, although it will be in the leftward direction. These time paths are
convergent to equilibrium, and their general configurations would be of the same type as
shown in region III of Fig. 17.1.
Given the phase line $f_2$, whose slope exceeds unity, however, a divergent time path
emerges. From an initial value $y_0$ greater than $\bar{y}$, the arrowheads lead steadily away from
the equilibrium to higher and higher y values. As you can verify, an initial value lower than
$\bar{y}$ gives rise to a similar steady divergent movement, though in the opposite direction.

When the phase line is negatively inclined, as in $f_3$ and $f_4$, the steady movement gives
way to oscillation, and there appears now the phenomenon of overshooting the equilibrium
mark. In diagram c, $y_0$ leads to $y_1$, which exceeds $\bar{y}$, only to be followed by $y_2$, which falls
short of $\bar{y}$, etc. The convergence of the time path will, in such cases, depend on the slope of
the phase line being less than 1 in its absolute value. This is the case of the phase line $f_3$,
where the extent of overshooting tends to diminish in successive periods. For the phase line
$f_4$, whose slope exceeds 1 numerically, on the other hand, the opposite tendency prevails,
resulting in a divergent time path.
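The four varieties can be compared side by side in a small experiment. The sketch below replaces the curved phase lines of Fig. 17.4 with straight lines through a common equilibrium, an assumption made purely to keep the slopes constant; the slopes 0.5, 1.5, -0.5, and -1.5, the equilibrium value 2, and the function names are all hypothetical.

```python
Y_BAR = 2.0  # assumed equilibrium value, shared by all four stand-in phase lines


def linear_phase_line(slope, y_bar=Y_BAR):
    """Straight phase line through (y_bar, y_bar) with the given slope."""
    return lambda y: y_bar + slope * (y - y_bar)


def iterate(f, y0, steps):
    """Return [y_0, ..., y_steps] with y_{t+1} = f(y_t)."""
    path = [y0]
    for _ in range(steps):
        path.append(f(path[-1]))
    return path


def classify_path(path, y_bar):
    """Return (oscillatory, convergent) judged from successive deviations from y_bar."""
    devs = [y - y_bar for y in path]
    pairs = list(zip(devs, devs[1:]))
    oscillatory = all(d1 * d2 < 0 for d1, d2 in pairs)     # deviation changes sign every period
    convergent = all(abs(d2) < abs(d1) for d1, d2 in pairs)  # deviation shrinks every period
    return oscillatory, convergent


results = {
    slope: classify_path(iterate(linear_phase_line(slope), 3.0, 8), Y_BAR)
    for slope in (0.5, 1.5, -0.5, -1.5)
}
for slope, (oscillatory, convergent) in results.items():
    print(slope, "oscillatory" if oscillatory else "steady",
          "convergent" if convergent else "divergent")
```

The sign of the slope decides between steady movement and oscillation, and the absolute value of the slope decides between convergence and divergence.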

The oscillatory time paths generated by phase lines $f_3$ and $f_4$ are reminiscent of the cobwebs
in Fig. 17.2. In Fig. 17.4c or d, however, the cobweb is spun around a phase line
(which contains a lag) and the 45° line, instead of around a demand curve and a (lagged)
supply curve. Here, a 45° line is used as a mechanical aid for transplotting a value of y,
whereas in Fig. 17.2, the D curve (which plays a role similar to that of the 45° line in
Fig. 17.4) is an integral part of the model itself. Specifically, once $Q_{st}$ is determined on the
supply curve, we let the arrowheads hit the D curve for the purpose of finding a price that
will “clear the market,” as was the rule of the game in the cobweb model. Consequently,
there is a basic difference in the labeling of the axes: in Fig. 17.2 there are two entirely different
variables, P and Q, but in Fig. 17.4 the axes represent the values of the same variable
y in two consecutive periods. Note, however, that if we analyze the graph of the difference
equation (17.11) which summarizes the cobweb model, rather than the separate demand
and supply functions in (17.10), then the resulting diagram will be a phase line such as
shown in Fig. 17.4. In other words, there really exist two alternative ways of graphically
analyzing the cobweb model, which will yield the identical result.

The basic rule emerging from the preceding consideration of the phase line is that the 
algebraic sign of its slope determines whether there will be oscillation, and the absolute 
value of its slope governs the question of convergence. If the phase line happens to contain 
both positively and negatively sloped segments, and if the absolute value of its slope is at 
some points greater, and elsewhere less, than 1, the time path will naturally become more 
complicated. However, even in such cases, the graphic-iterative analysis can be employed
with equal ease. Of course, an initial value must be given to us before the iteration can be
duly started. Indeed, in these more complicated cases, a different initial value can lead to a
time path of an altogether different breed (see Exercises 17.6-2 and 17.6-3).

A Market with a Price Ceiling 

We shall now cite an economic example of a nonlinear difference equation. In Fig. 17.4, the
four nonlinear phase lines all happen to be of the smooth variety; in the present example,
we shall show a nonsmooth phase line.

As a point of departure, let us take the linear difference equation (17.11) of the cobweb
model and rewrite it as

$$P_{t+1} = \frac{\alpha + \gamma}{\beta} - \frac{\delta}{\beta} P_t \tag{17.17}$$
FIGURE 17.5
This is in the format of $P_{t+1} = f(P_t)$, with $f'(P_t) = -\delta/\beta < 0$. We have plotted this
linear phase line in Fig. 17.5 on the assumption that the slope is greater than 1 in absolute
value, implying explosive oscillation.

Now let there be imposed a legal price ceiling $\hat{P}$ (read: “P caret” or, less formally, “P
hat”). This can be shown in Fig. 17.5 as a horizontal straight line because, irrespective of
the level of $P_t$, $P_{t+1}$ is now forbidden to exceed the level of $\hat{P}$. What this does is to invalidate
that part of the phase line lying above $\hat{P}$ or, to view it differently, to bend down the
upper part of the phase line to the level of $\hat{P}$, thus resulting in a kinked phase line.* In
view of the kink, the new (heavy) phase line is not only nonlinear but nonsmooth as well.
Like a step function, this kinked line will require more than one equation to express it
algebraically:

$$
P_{t+1} =
\begin{cases}
\hat{P} & (\text{for } P_t \le k) \\[4pt]
\dfrac{\alpha + \gamma}{\beta} - \dfrac{\delta}{\beta} P_t & (\text{for } P_t > k)
\end{cases}
\tag{17.17'}
$$

where k denotes the value of $P_t$ at the kink.

Assuming an initial price $P_0$, let us trace out the time path of price iteratively. During the
first stage of iteration, when the downward-sloping segment of the phase line is in effect,
the explosive oscillatory tendency clearly manifests itself. After a few periods, however, the
arrowheads begin to hit the ceiling price, and thereafter the time path will develop into a
perpetual cyclical movement between $\hat{P}$ and an effective price floor $\tilde{P}$ (read: “P tilde”
or, less formally, “P wiggle”). Thus, by virtue of the price ceiling, the intrinsic explosive
tendency of the model is effectively contained, and the ever-widening oscillation is now
tamed into a uniform oscillation producing a so-called limit cycle.
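The emergence of the limit cycle is easy to reproduce by iteration. In the sketch below the kinked phase line is written as the smaller of the ceiling and the unconstrained cobweb price, which is equivalent to the two-branch form because the linear branch exceeds $\hat{P}$ exactly when $P_t$ lies below the kink. The function names and all numerical values are hypothetical: they give the linear segment a slope of -1.5 and place the ceiling at 6.

```python
def next_price(p, alpha, beta, gamma, delta, ceiling):
    """Kinked phase line: the cobweb price, capped at the legal ceiling."""
    unconstrained = (alpha + gamma) / beta - (delta / beta) * p
    return min(ceiling, unconstrained)


def price_path(p0, periods, **params):
    """Return [P_0, ..., P_periods] under the price ceiling."""
    prices = [p0]
    for _ in range(periods):
        prices.append(next_price(prices[-1], **params))
    return prices


# Hypothetical parameters (assumptions): intercept 12, slope -1.5, ceiling 6.
params = dict(alpha=18, beta=2, gamma=6, delta=3, ceiling=6)
path = price_path(5, 12, **params)

floor = next_price(params["ceiling"], **params)  # the effective price floor
print(path)
print("cycle between", params["ceiling"], "and", floor)
```

With these values the first few prices swing ever more widely around the equilibrium of 4.8, after which the path settles into an alternation between the ceiling and the effective floor.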

* Strictly speaking, we should also “bend” that part of the phase line lying to the right of the point
$\hat{P}$ on the horizontal axis. But it does no harm to leave it as it is, as long as the other end has already
been bent, because the transplotting of $P_{t+1}$ to the horizontal axis will carry the upper limit of $\hat{P}$
over to the $P_t$ axis automatically.
What is significant about this result is that, whereas in the case of a linear phase line a
uniformly oscillatory path can be produced if and only if the slope of the phase line is -1,
now after the introduction of nonlinearity the same analytical result can arise even when the
phase line has a slope other than -1. The economic implication of this is of considerable
import. If one observes a more or less uniform oscillation in the actual time path of a variable
and attempts to explain it by means of a linear model, one will be forced to rely on the
rather special (and implausible) model specification that the phase-line slope is exactly
-1. But if nonlinearity is introduced, in either the smooth or the nonsmooth variety, then a
host of more reasonable assumptions can be used, each of which can equally account for
the observed feature of uniform oscillation.


EXERCISE 17.6 


1. In difference-equation models, the variable t can only take integer values. Does this
imply that in the phase diagrams of Fig. 17.4 the variables $y_t$ and $y_{t+1}$ must be considered
as discrete variables?

2. As a phase line, use the left half of an inverse U-shaped curve, and let it intersect the
45° line at two points L (left) and R (right).

(a) Is this a case of multiple equilibria?

(b) If the initial value $y_0$ lies to the left of L, what kind of time path will be obtained?

(c) What if the initial value lies between L and R?

(d) What if the initial value lies to the right of R?

(e) What can you conclude about the dynamic stability of equilibrium at L and at R,
respectively?

3. As a phase line, use an inverse U-shaped curve. Let its upward-sloping segment intersect
the 45° line at point L, and let its downward-sloping segment intersect the 45° line
at point R. Answer the same five questions raised in Prob. 2. (Note: Your answer will
depend on the particular way the phase line is drawn; explore various possibilities.)

4. In Fig. 17.5, rescind the legal price ceiling and impose a minimum price $P_m$ instead.

(a) How will the phase line change?

(b) Will it be kinked? Nonlinear?

(c) Will there also develop a uniformly oscillatory movement in price?

5. With reference to (17.17') and Fig. 17.5, show that the constant k can be expressed as

$$k = \frac{\alpha + \gamma}{\delta} - \frac{\beta}{\delta}\hat{P}$$



Chapter 18

Higher-Order Difference Equations


The economic models in Chap. 17 involve difference equations that relate $P_t$ and $P_{t-1}$ to
each other. As the P value in one period can uniquely determine the P value in the next, the
time path of P becomes fully determinate once an initial value $P_0$ is specified. It may happen,
however, that the value of an economic variable in period t (say, $y_t$) depends not only
on $y_{t-1}$ but also on $y_{t-2}$. Such a situation will give rise to a difference equation of the
second order.

Strictly speaking, a second-order difference equation is one that involves an expression
$\Delta^2 y_t$, called the second difference of $y_t$, but contains no differences of order higher than 2.
The symbol $\Delta^2$, the discrete-time counterpart of the symbol $d^2/dt^2$, is an instruction to
“take the second difference” as follows:

$$
\begin{aligned}
\Delta^2 y_t = \Delta(\Delta y_t) &= \Delta(y_{t+1} - y_t) && \text{[by (17.1)]} \\
&= (y_{t+2} - y_{t+1}) - (y_{t+1} - y_t) && \text{[again by (17.1)]}^{\dagger} \\
&= y_{t+2} - 2y_{t+1} + y_t
\end{aligned}
$$

Thus a second difference of $y_t$ is transformable into a sum of terms involving a two-period
time lag. Since expressions like $\Delta^2 y_t$ and $\Delta y_t$ are quite cumbersome to work with, we shall
simply redefine a second-order difference equation as one involving a two-period time lag
in the variable. Similarly, a third-order difference equation is one that involves a three-period
time lag, etc.
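The equivalence between “differencing twice” and the two-period-lag expression $y_{t+2} - 2y_{t+1} + y_t$ can be confirmed on any numerical series. The following is a small sketch; the series $y_t = t^2$ and the function names are assumptions chosen only for illustration.

```python
def first_difference(y):
    """Return the list of first differences y_{t+1} - y_t."""
    return [later - earlier for earlier, later in zip(y, y[1:])]


def second_difference(y):
    """Take the first difference of the first difference."""
    return first_difference(first_difference(y))


def second_difference_expanded(y):
    """Evaluate y_{t+2} - 2*y_{t+1} + y_t directly."""
    return [y[t + 2] - 2 * y[t + 1] + y[t] for t in range(len(y) - 2)]


y = [t ** 2 for t in range(6)]  # hypothetical series: 0, 1, 4, 9, 16, 25
print(second_difference(y))
print(second_difference_expanded(y))
```

Both routes give the same list, and each differencing shortens the series by one term, which is why a second difference spans three consecutive periods.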

Let us first concentrate on the method of solving a second-order difference equation,
leaving the generalization to higher-order equations to Section 18.4. To keep the scope of
discussion manageable, we shall only deal with linear difference equations with constant
coefficients in the present chapter. However, both the constant-term and variable-term
varieties will be examined.

† That is, we first move the subscripts in the $(y_{t+1} - y_t)$ expression forward by one period, to get
a new expression $(y_{t+2} - y_{t+1})$, and then we subtract from the latter the original expression. Note
that, since the resulting difference may be written as $\Delta y_{t+1} - \Delta y_t$, we may infer the following rule
of operation:

$$\Delta(y_{t+1} - y_t) = \Delta y_{t+1} - \Delta y_t$$

This is reminiscent of the rule applicable to the derivative of a sum or difference.
18.1 Second-Order Linear Difference Equations with Constant Coefficients and Constant Term


A simple variety of second-order difference equations takes the form

$$y_{t+2} + a_1 y_{t+1} + a_2 y_t = c \tag{18.1}$$

You will recognize this equation to be linear, nonhomogeneous, and with constant
coefficients ($a_1$, $a_2$) and constant term c.

Particular Solution 

As before, the solution of (18.1) may be expected to have two components: a particular
solution $y_p$ representing the intertemporal equilibrium level of y, and a complementary
function $y_c$ specifying, for every time period, the deviation from the equilibrium. The
particular solution, defined as any solution of the complete equation, can sometimes be
found simply by trying a solution of the form $y_t = k$. Substituting this constant value of y
into (18.1), we obtain

$$k + a_1 k + a_2 k = c \qquad \text{and} \qquad k = \frac{c}{1 + a_1 + a_2}$$

Thus, so long as $(1 + a_1 + a_2) \neq 0$, the particular integral is

$$y_p\,(= k) = \frac{c}{1 + a_1 + a_2} \qquad (\text{case of } a_1 + a_2 \neq -1) \tag{18.2}$$

Example 1


Find the particular integral of $y_{t+2} - 3y_{t+1} + 4y_t = 6$. Here we have $a_1 = -3$, $a_2 = 4$, and
$c = 6$. Since $a_1 + a_2 \neq -1$, the particular solution can be obtained from (18.2) as follows:

$$y_p = \frac{6}{1 - 3 + 4} = 3$$


In case $a_1 + a_2 = -1$, then the trial solution $y_t = k$ breaks down, and we can try
$y_t = kt$ instead. Substituting the latter into (18.1) and bearing in mind that we now have
$y_{t+1} = k(t + 1)$ and $y_{t+2} = k(t + 2)$, we find that

$$k(t + 2) + a_1 k(t + 1) + a_2 kt = c$$

and

$$k = \frac{c}{(1 + a_1 + a_2)t + a_1 + 2} = \frac{c}{a_1 + 2} \qquad [\text{since } a_1 + a_2 = -1]$$

Thus we can write the particular solution as

$$y_p\,(= kt) = \frac{c}{a_1 + 2}\,t \qquad (\text{case of } a_1 + a_2 = -1;\ a_1 \neq -2) \tag{18.2'}$$


Example 2


Find the particular solution of $y_{t+2} + y_{t+1} - 2y_t = 12$. Here, $a_1 = 1$, $a_2 = -2$, and $c = 12$.
Obviously, formula (18.2) is not applicable, but (18.2') is. Thus,

$$y_p = \frac{12}{1 + 2}\,t = 4t$$

This particular solution represents a moving equilibrium.
If $a_1 + a_2 = -1$, but at the same time $a_1 = -2$ (that is, if $a_1 = -2$ and $a_2 = 1$), then we
can adopt a trial solution of the form $y_t = kt^2$, which implies $y_{t+1} = k(t + 1)^2$, etc. As you
may verify, in this case the particular solution turns out to be

$$y_p\,(= kt^2) = \frac{c}{2}\,t^2 \qquad (\text{case of } a_1 = -2;\ a_2 = 1) \tag{18.2''}$$

However, since this formula applies only to the unique case of the difference equation
$y_{t+2} - 2y_{t+1} + y_t = c$, its usefulness is rather limited.
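The three formulas (18.2), (18.2'), and (18.2'') amount to a simple case selection, which can be sketched as follows. The function names `particular_solution` and `satisfies_equation` are hypothetical, and exact fractions are used so that the test $a_1 + a_2 = -1$ is not disturbed by rounding.

```python
from fractions import Fraction


def particular_solution(a1, a2, c):
    """Return y_p as a function of t for y_{t+2} + a1*y_{t+1} + a2*y_t = c."""
    a1, a2, c = Fraction(a1), Fraction(a2), Fraction(c)
    if a1 + a2 != -1:
        k = c / (1 + a1 + a2)   # constant solution y_p = k
        return lambda t: k
    if a1 != -2:
        k = c / (a1 + 2)        # moving equilibrium y_p = k*t
        return lambda t: k * t
    k = c / 2                   # remaining case a1 = -2, a2 = 1: y_p = k*t**2
    return lambda t: k * t ** 2


def satisfies_equation(a1, a2, c, y, horizon=10):
    """Check the complete equation for t = 0, 1, ..., horizon - 1."""
    return all(y(t + 2) + a1 * y(t + 1) + a2 * y(t) == c for t in range(horizon))


y_p1 = particular_solution(-3, 4, 6)    # Example 1
y_p2 = particular_solution(1, -2, 12)   # Example 2
y_p3 = particular_solution(-2, 1, 8)    # hypothetical instance of the third case
print(y_p1(0), y_p2(5), y_p3(3))
```

Substituting each returned function back into the complete equation, as `satisfies_equation` does, is the same verification the text invites you to carry out by hand.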


Complementary Function 

To find the complementary function, we must concentrate on the reduced equation

$$y_{t+2} + a_1 y_{t+1} + a_2 y_t = 0 \tag{18.3}$$

Our experience with first-order difference equations has taught us that the expression $Ab^t$
plays a prominent role in the general solution of such an equation. Let us therefore try a
solution of the form $y_t = Ab^t$, which naturally implies that $y_{t+1} = Ab^{t+1}$, and so on. It is
our task now to determine the values of A and b.

Upon substitution of the trial solution into (18.3), the equation becomes

$$Ab^{t+2} + a_1 Ab^{t+1} + a_2 Ab^t = 0$$

or, after canceling the (nonzero) common factor $Ab^t$,

$$b^2 + a_1 b + a_2 = 0 \tag{18.3'}$$

This quadratic equation, the characteristic equation of (18.3) or of (18.1), which is comparable
to (16.4''), possesses the two characteristic roots

$$b_1, b_2 = \frac{-a_1 \pm \sqrt{a_1^2 - 4a_2}}{2} \tag{18.4}$$

each of which is acceptable in the solution $Ab^t$. In fact, both $b_1$ and $b_2$ should appear in the
general solution of the homogeneous difference equation (18.3) because, just as in the case
of differential equations, this general solution must consist of two linearly independent
parts, each with its own multiplicative arbitrary constant.

Three possible situations may be encountered in regard to the characteristic roots, 
depending on the square-root expression in (18.4). You will find these parallel very closely 
the analysis of second-order differential equations in Sec. 16.1. 

Case 1 (distinct real roots) When $a_1^2 > 4a_2$, the square root in (18.4) is a real number,
and $b_1$ and $b_2$ are real and distinct. In that event, $b_1^t$ and $b_2^t$ are linearly independent, and the
complementary function can simply be written as a linear combination of these expressions;
that is,

$$y_c = A_1 b_1^t + A_2 b_2^t \tag{18.5}$$

You should compare this with (16.7).


Example 3


Find the solution of $y_{t+2} + y_{t+1} - 2y_t = 12$. This equation has the coefficients $a_1 = 1$ and
$a_2 = -2$; from (18.4), the characteristic roots can be found to be $b_1, b_2 = 1, -2$. Thus, the
complementary function is

$$y_c = A_1(1)^t + A_2(-2)^t = A_1 + A_2(-2)^t$$
Since, in Example 2, the particular solution of the given difference equation has already
been found to be $y_p = 4t$, we can write the general solution as

$$y_t = y_c + y_p = A_1 + A_2(-2)^t + 4t$$

There are still two arbitrary constants $A_1$ and $A_2$ to be definitized; to accomplish this, two
initial conditions are necessary. Suppose that we are given $y_0 = 4$ and $y_1 = 5$. Then, since
by letting $t = 0$ and $t = 1$ successively in the general solution we find

$$y_0 = A_1 + A_2 \qquad (= 4 \text{ by the first initial condition})$$

$$y_1 = A_1 - 2A_2 + 4 \qquad (= 5 \text{ by the second initial condition})$$

the arbitrary constants can be definitized to $A_1 = 3$ and $A_2 = 1$. The definite solution then
can finally be written as

$$y_t = 3 + (-2)^t + 4t$$
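The definite solution can be checked against the difference equation itself: starting from $y_0 = 4$ and $y_1 = 5$, each later value follows from $y_{t+2} = 12 - y_{t+1} + 2y_t$. The sketch below also recomputes the roots and the constants; all function names are hypothetical, and the routines cover only the distinct-real-root case.

```python
import math


def characteristic_roots(a1, a2):
    """Roots of b**2 + a1*b + a2 = 0, for the distinct-real-root case only."""
    disc = a1 ** 2 - 4 * a2
    if disc <= 0:
        raise ValueError("this sketch covers only the case a1**2 > 4*a2")
    root = math.sqrt(disc)
    return (-a1 + root) / 2, (-a1 - root) / 2


def definitize(b1, b2, y0, y1, yp0, yp1):
    """Solve A1 + A2 = y0 - yp0 and A1*b1 + A2*b2 = y1 - yp1 for (A1, A2)."""
    a2_const = ((y1 - yp1) - b1 * (y0 - yp0)) / (b2 - b1)
    a1_const = (y0 - yp0) - a2_const
    return a1_const, a2_const


def iterate_recurrence(a1, a2, c, y0, y1, periods):
    """Return [y_0, ..., y_periods] from y_{t+2} = c - a1*y_{t+1} - a2*y_t."""
    y = [y0, y1]
    while len(y) <= periods:
        y.append(c - a1 * y[-1] - a2 * y[-2])
    return y


def example3_solution(t):
    """Definite solution of Example 3."""
    return 3 + (-2) ** t + 4 * t


b1, b2 = characteristic_roots(1, -2)
A1, A2 = definitize(b1, b2, y0=4, y1=5, yp0=0, yp1=4)  # y_p = 4t gives yp0 = 0, yp1 = 4
print(b1, b2, A1, A2)
print(iterate_recurrence(1, -2, 12, 4, 5, 8))
print([example3_solution(t) for t in range(9)])
```

The iterated values and the values of the closed-form expression coincide period by period.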

Case 2 (repeated real roots) When $a_1^2 = 4a_2$, the square root in (18.4) vanishes, and the
characteristic roots are repeated:

$$b\,(= b_1 = b_2) = -\frac{a_1}{2}$$

Now, if we express the complementary function in the form of (18.5), the two components
will collapse into a single term:

$$A_1 b_1^t + A_2 b_2^t = (A_1 + A_2)b^t \equiv A_3 b^t$$

This will not do, because we are now short of one constant.

To supply the missing component (which, we recall, should be linearly independent of
the term $A_3 b^t$), the old trick of multiplying $b^t$ by the variable t will again work. The new
component term is therefore to take the form $A_4 t b^t$. That this is linearly independent of
$A_3 b^t$ should be obvious, for we can never obtain the expression $A_4 t b^t$ by attaching a constant
coefficient to $A_3 b^t$. That $A_4 t b^t$ does indeed qualify as a solution of the homogeneous
equation (18.3), just as $A_3 b^t$ does, can easily be verified by substituting $y_t = A_4 t b^t$ [and
$y_{t+1} = A_4 (t + 1) b^{t+1}$, etc.] into (18.3)* and seeing that the latter will reduce to an identity
$0 = 0$.

The complementary function for the repeated-root case is therefore 

$$y_c = A_3 b^t + A_4 t b^t \tag{18.6}$$

which you should compare with (16.9). 


Example 4


Find the complementary function of $y_{t+2} + 6y_{t+1} + 9y_t = 4$. The coefficients being $a_1 = 6$
and $a_2 = 9$, the characteristic roots are found to be $b_1 = b_2 = -3$. We therefore have

$$y_c = A_3(-3)^t + A_4 t(-3)^t$$

If we proceed a step further, we can easily find y p = so the general solution of the 
given difference equation is 

Yt= 4 3 (-3)'-M4t(-3) ! f 3 

Given two initial conditions, ^3 and A 4 can again be assigned definite values. 
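
As a quick check of Example 4, the following sketch substitutes the general solution into the difference equation using exact rational arithmetic. The values tried for $A_3$ and $A_4$ are arbitrary illustrations, not taken from the text; any other values should work equally well.

```python
from fractions import Fraction


def example4_solution(t, a3, a4):
    """Evaluate y_t = A3*(-3)**t + A4*t*(-3)**t + 1/4 for an integer t >= 0."""
    return a3 * (-3) ** t + a4 * t * (-3) ** t + Fraction(1, 4)


def residual(t, a3, a4):
    """Left side minus right side of y_{t+2} + 6*y_{t+1} + 9*y_t = 4."""
    return (example4_solution(t + 2, a3, a4)
            + 6 * example4_solution(t + 1, a3, a4)
            + 9 * example4_solution(t, a3, a4)
            - 4)


# Hypothetical constants, chosen only to exercise both component terms.
trial_constants = [(1, 0), (0, 1), (2, -5)]
checks = [residual(t, a3, a4) for a3, a4 in trial_constants for t in range(6)]
```

Every entry of `checks` should be exactly zero, which confirms both the repeated-root complementary function and the particular solution $y_p = \frac{1}{4}$.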

* In this substitution it should be kept in mind that we have in the present case $a_1^2 = 4a_2$ and
$b = -a_1/2$.

Case 3 (complex roots) Under the remaining possibility of $a_1^2 < 4a_2$, the characteristic
roots are conjugate complex. Specifically, they will be in the form

$$b_1, b_2 = h \pm vi$$

where

$$h = -\frac{a_1}{2} \qquad \text{and} \qquad v = \frac{\sqrt{4a_2 - a_1^2}}{2} \tag{18.7}$$

The complementary function itself thus becomes

$$y_c = A_1 b_1^t + A_2 b_2^t = A_1(h + vi)^t + A_2(h - vi)^t$$

As it stands, $y_c$ is not easily interpreted. But fortunately, thanks to De Moivre's theorem,
given in (16.23'), this complementary function can easily be transformed into trigonometric
terms, which we have learned to interpret.

According to the said theorem, we can write

$$(h \pm vi)^t = R^t(\cos\theta t \pm i \sin\theta t)$$

where the value of $R$ (always taken to be positive) is, by (16.10),

$$R = \sqrt{h^2 + v^2} = \sqrt{\frac{a_1^2 + 4a_2 - a_1^2}{4}} = \sqrt{a_2} \tag{18.8}$$

and $\theta$ is the radian measure of the angle in the interval $[0, 2\pi)$, which satisfies the
conditions

$$\cos\theta = \frac{h}{R} = \frac{-a_1}{2\sqrt{a_2}} \qquad \text{and} \qquad \sin\theta = \frac{v}{R} = \sqrt{1 - \frac{a_1^2}{4a_2}} \tag{18.9}$$

Therefore, the complementary function can be transformed as follows:

$$\begin{aligned}
y_c &= A_1 R^t(\cos\theta t + i \sin\theta t) + A_2 R^t(\cos\theta t - i \sin\theta t) \\
    &= R^t[(A_1 + A_2)\cos\theta t + (A_1 - A_2)\, i \sin\theta t] \\
    &= R^t(A_5 \cos\theta t + A_6 \sin\theta t)
\end{aligned} \tag{18.10}$$

where we have adopted the shorthand symbols

$$A_5 \equiv A_1 + A_2 \qquad \text{and} \qquad A_6 \equiv (A_1 - A_2)i$$

The complementary function (18.10) differs from its differential-equation counterpart
(16.24') in two important respects. First, the expressions $\cos\theta t$ and $\sin\theta t$ have replaced the
previously used $\cos vt$ and $\sin vt$. Second, the multiplicative factor $R^t$ (an exponential with
base $R$) has replaced the natural exponential expression $e^{ht}$. In short, we have switched
from the Cartesian coordinates ($h$ and $v$) of the complex roots to their polar coordinates
($R$ and $\theta$). The values of $R$ and $\theta$ can be determined from (18.8) and (18.9) once $h$ and $v$
become known. It is also possible to calculate $R$ and $\theta$ directly from the parameter values
$a_1$ and $a_2$ via (18.8) and (18.9), provided we first make certain that $a_1^2 < 4a_2$ and that the
roots are indeed complex.

Example 5

Find the general solution of $y_{t+2} + \frac{1}{4} y_t = 5$. With coefficients $a_1 = 0$ and $a_2 = \frac{1}{4}$, this
constitutes an illustration of the complex-root case of $a_1^2 < 4a_2$. By (18.7), the real and
imaginary parts of the roots are $h = 0$ and $v = \frac{1}{2}$. It follows from (18.8) that

$$R = \sqrt{0 + \left(\frac{1}{2}\right)^2} = \frac{1}{2}$$

Since the value of $\theta$ is that which can satisfy the two equations

$$\cos\theta = \frac{h}{R} = 0 \qquad \text{and} \qquad \sin\theta = \frac{v}{R} = 1$$

it may be concluded from Table 16.1 that

$$\theta = \frac{\pi}{2}$$

Consequently, the complementary function is

$$y_c = \left(\frac{1}{2}\right)^t \left(A_5 \cos\frac{\pi}{2} t + A_6 \sin\frac{\pi}{2} t\right)$$

To find $y_p$, let us try a constant solution $y_t = k$ in the complete equation. This yields
$k = 4$; thus, $y_p = 4$, and the general solution can be written as

$$y_t = \left(\frac{1}{2}\right)^t \left(A_5 \cos\frac{\pi}{2} t + A_6 \sin\frac{\pi}{2} t\right) + 4 \tag{18.11}$$

Example 6

Find the general solution of $y_{t+2} - 4y_{t+1} + 16y_t = 0$. In the first place, the particular
solution is easily found to be $y_p = 0$. This means that the general solution $y_t\ (= y_c + y_p)$ will be
identical with $y_c$. To find the latter, we note that the coefficients $a_1 = -4$ and $a_2 = 16$ do
produce complex roots. Thus we may substitute the $a_1$ and $a_2$ values directly into (18.8)
and (18.9) to obtain

$$R = \sqrt{16} = 4$$

$$\cos\theta = \frac{4}{2 \cdot 4} = \frac{1}{2} \qquad \text{and} \qquad \sin\theta = \sqrt{1 - \frac{16}{4 \cdot 16}} = \sqrt{\frac{3}{4}} = \frac{\sqrt{3}}{2}$$

The last two equations enable us to find from Table 16.2 that

$$\theta = \frac{\pi}{3}$$

It follows that the complementary function, which also serves as the general solution
here, is

$$y_c\ (= y_t) = 4^t \left(A_5 \cos\frac{\pi}{3} t + A_6 \sin\frac{\pi}{3} t\right) \tag{18.12}$$
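
The polar-coordinate calculations of Examples 5 and 6 can be reproduced mechanically. The sketch below applies (18.8) and (18.9) to a given pair of coefficients. The function name `polar_form` is hypothetical, and the use of `math.atan2` to recover the angle is merely one convenient way of replacing the table lookup.

```python
import math


def polar_form(a1, a2):
    """Return (R, theta) for the complex roots of b**2 + a1*b + a2 = 0.

    R and theta follow (18.8) and (18.9); theta is placed in [0, 2*pi).
    """
    if a1 ** 2 >= 4 * a2:
        raise ValueError("roots are not complex: need a1**2 < 4*a2")
    r = math.sqrt(a2)                                  # (18.8)
    cos_theta = -a1 / (2 * r)                          # (18.9)
    sin_theta = math.sqrt(1 - a1 ** 2 / (4 * a2))      # (18.9)
    theta = math.atan2(sin_theta, cos_theta) % (2 * math.pi)
    return r, theta


r5, theta5 = polar_form(0, 0.25)    # Example 5
r6, theta6 = polar_form(-4, 16)     # Example 6
```

For Example 5 this should give $R = \frac{1}{2}$ with $\theta$ close to $\pi/2$, and for Example 6 it should give $R = 4$ with $\theta$ close to $\pi/3$.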

The Convergence of the Time Path 

As in the case of first-order difference equations, the convergence of the time path $y_t$ hinges
solely on whether $y_c$ tends toward zero as $t \to \infty$. What we learned about the various
configurations of the expression $b^t$ in Fig. 17.1 is therefore still applicable, although in the
present context we shall have to consider two characteristic roots rather than one.

Consider first the case of distinct real roots: $b_1 \neq b_2$. If $|b_1| > 1$ and $|b_2| > 1$, then
both component terms in the complementary function (18.5), $A_1 b_1^t$ and $A_2 b_2^t$, will be
explosive, and thus $y_c$ must be divergent. In the opposite case of $|b_1| < 1$ and $|b_2| < 1$, both
terms in $y_c$ will converge toward zero as $t$ is indefinitely increased, as will $y_c$ also. What if
$|b_1| > 1$ but $|b_2| < 1$? In this intermediate case, it is evident that the $A_2 b_2^t$ term tends to
"die down" while the other term tends to deviate farther from zero. It follows that the $A_1 b_1^t$
term must eventually dominate the scene and render the path divergent.

Let us call the root with the higher absolute value the dominant root. Then it appears that
it is the dominant root which really sets the tone of the time path, at least with regard to its
ultimate convergence or divergence. Such is indeed the case. We may state, thus, that a time
path will be convergent, whatever the initial conditions may be, if and only if the dominant
root is less than 1 in absolute value. You can verify that this statement is valid for the cases
where both roots are greater than or less than 1 in absolute value (discussed previously), and
where one root has an absolute value of 1 exactly (not discussed previously). Note, however,
that even though the eventual convergence depends on the dominant root alone, the
nondominant root will exert a definite influence on the time path, too, at least in the beginning
periods. Therefore, the exact configuration of $y_t$ is still dependent on both roots.

Turning to the repeated-root case, we find the complementary function to consist of the
terms $A_3 b^t$ and $A_4 t b^t$, as shown in (18.6). The former is already familiar to us, but a word
of explanation is still needed for the latter, which involves a multiplicative $t$. If $|b| > 1$, the
$b^t$ term will be explosive, and the multiplicative $t$ will simply serve to intensify the
explosiveness as $t$ increases. If $|b| < 1$, on the other hand, the $b^t$ part (which tends to zero as $t$
increases) and the $t$ part will run counter to each other; i.e., the value of $t$ will offset rather
than reinforce $b^t$. Which force will prove the stronger? The answer is that the damping
force of $b^t$ will always win over the exploding force of $t$. For this reason, the basic
requirement for convergence in the repeated-root case is still that the root be less than 1 in absolute
value.

Example 7

Analyze the convergence of the solutions in Examples 3 and 4. For Example 3, the solution is

$$y_t = 3 + (-2)^t + 4t$$

where the roots are 1 and $-2$, respectively [$3(1)^t = 3$], and where there is a moving
equilibrium $4t$. The dominant root being $-2$, the time path is divergent.

For Example 4, where the solution is

$$y_t = A_3(-3)^t + A_4 t(-3)^t + \frac{1}{4}$$

and where $|b| = 3$, we also have divergence.
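
The dominant-root criterion lends itself to a short numerical test. The sketch below computes both characteristic roots of $b^2 + a_1 b + a_2 = 0$ and compares the larger absolute value with 1. The function names are illustrative, and the coefficients used for Example 3 ($a_1 = 1$, $a_2 = -2$) are an assumption inferred from its stated roots 1 and $-2$.

```python
import cmath


def characteristic_roots(a1, a2):
    """Return both roots of b**2 + a1*b + a2 = 0 as complex numbers."""
    disc = cmath.sqrt(a1 ** 2 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2


def is_convergent(a1, a2):
    """True when the dominant root is less than 1 in absolute value."""
    return max(abs(b) for b in characteristic_roots(a1, a2)) < 1


example3 = is_convergent(1, -2)    # assumed coefficients; roots 1 and -2
example4 = is_convergent(6, 9)     # repeated root -3
example5 = is_convergent(0, 0.25)  # complex roots with absolute value 1/2
```

Because `abs` applied to a complex number returns its modulus, the same test covers all three cases of characteristic roots. Only the last of the three equations above should come out as convergent.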

Let us now consider the complex-root case. From the general form of the complementary
function in (18.10),

$$y_c = R^t(A_5 \cos\theta t + A_6 \sin\theta t)$$

it is clear that the parenthetical expression, like the one in (16.24'), will produce a fluctuating
pattern of a periodic nature. However, since the variable $t$ can only take integer values
0, 1, 2, ... in the present context, we shall catch and utilize only a subset of the points on
the graph of a circular function. The $y$ value at each such point will always prevail for a
whole period, till the next relevant point is reached. As illustrated in Fig. 18.1, the resulting
path is neither the usual oscillatory type (not alternating between values above and below
$y_p$ in consecutive periods), nor the usual fluctuating type (not smooth); rather, it displays a
sort of stepped fluctuation. As far as convergence is concerned, though, the decisive factor
is really the $R^t$ term, which, like the $e^{ht}$ term in (16.24'), will dictate whether the stepped
fluctuation is to be intensified or mitigated as $t$ increases. In the present case, the fluctuation
can be gradually narrowed down if and only if $R < 1$. Since $R$ is by definition the
absolute value of the conjugate complex roots ($h \pm vi$), the condition for convergence is
again that the characteristic roots be less than unity in absolute value.

FIGURE 18.1


To summarize: For all three cases of characteristic roots, the time path will converge to
a (stationary or moving) intertemporal equilibrium, regardless of what the initial conditions
may happen to be, if and only if the absolute value of every root is less than 1.

Example 8

Are the time paths (18.11) and (18.12) convergent? In (18.11) we have $R = \frac{1}{2}$; therefore
the time path will converge to the stationary equilibrium (= 4). In (18.12), on the other
hand, we have $R = 4$, so the time path will not converge to the equilibrium (= 0).
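
The two verdicts of Example 8 can also be seen by iterating the difference equations of Examples 5 and 6 directly, without reference to the roots. In the sketch below the initial conditions are hypothetical values chosen only for illustration, since the text leaves $A_5$ and $A_6$ arbitrary.

```python
def iterate(a1, a2, c, y0, y1, periods):
    """Iterate y_{t+2} = c - a1*y_{t+1} - a2*y_t and return [y_0, ..., y_periods]."""
    path = [y0, y1]
    for _ in range(periods - 1):
        path.append(c - a1 * path[-1] - a2 * path[-2])
    return path


# Hypothetical initial conditions, for illustration only.
path_11 = iterate(0, 0.25, 5, y0=6.0, y1=3.0, periods=40)   # equation of Example 5
path_12 = iterate(-4, 16, 0, y0=1.0, y1=1.0, periods=40)    # equation of Example 6
```

With these starting values the first path should settle very close to 4, while the second should grow without bound in absolute value.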


EXERCISE 18.1 

1. Write out the characteristic equation for each of the following, and find the characteristic
roots:

(a) $y_{t+2} - y_{t+1} + \frac{1}{2} y_t = 2$        (c) $y_{t+2} + \frac{1}{2} y_{t+1} - \frac{1}{2} y_t = 5$

(b) $y_{t+2} - 4y_{t+1} + 4y_t = 7$        (d) $y_{t+2} - 2y_{t+1} + 3y_t = 4$

2. For each of the difference equations in Prob. 1 state on the basis of its characteristic 
roots whether the time path involves oscillation or stepped fluctuation, and whether it 
is explosive. 

3. Find the particular solutions of the equations in Prob. 1. Do these represent stationary 
or moving equilibria? 

4. Solve the following difference equations:

(a) $y_{t+2} + 3y_{t+1} - \frac{7}{4} y_t = 9$        ($y_0 = 6$; $y_1 = 3$)

(b) $y_{t+2} - 2y_{t+1} + 2y_t = 1$        ($y_0 = 3$; $y_1 = 4$)

(c) $y_{t+2} - y_{t+1} + \frac{1}{4} y_t = 2$        ($y_0 = 4$; $y_1 = 7$)

5. Analyze the time paths obtained in Prob. 4.

18.2 Samuelson Multiplier-Acceleration Interaction Model

As an illustration of the use of second-order difference equations in economics, let us cite a
classic work of Professor Paul Samuelson, the first American economist to win the Nobel Prize. We
refer to his classic interaction model, which seeks to explore the dynamic process of income
determination when the acceleration principle is in operation along with the Keynesian
multiplier.* Among other things, that model serves to demonstrate that the mere interaction of the
multiplier and the accelerator is capable of generating cyclical fluctuations endogenously.

The Framework 

Suppose that national income $Y_t$ is made up of three component expenditure streams:
consumption $C_t$, investment $I_t$, and government expenditure $G_t$. Consumption is envisaged as
a function not of current income but of the income of the prior period, $Y_{t-1}$; for simplicity,
it is assumed that $C_t$ is strictly proportional to $Y_{t-1}$. Investment, which is of the "induced"
variety, is a function of the prevailing trend of consumer spending. It is through this induced
investment, of course, that the acceleration principle enters into the model. Specifically, we
shall assume $I_t$ to bear a fixed ratio to the consumption increment $\Delta C_{t-1} \equiv C_t - C_{t-1}$. The
third component, $G_t$, on the other hand, is taken to be exogenous; in fact, we shall assume
it to be a constant and simply denote it by $G_0$.

These assumptions can be translated into the following set of equations: 

$$\begin{aligned}
Y_t &= C_t + I_t + G_0 \\
C_t &= \gamma Y_{t-1} \qquad (0 < \gamma < 1) \\
I_t &= \alpha(C_t - C_{t-1}) \qquad (\alpha > 0)
\end{aligned} \tag{18.13}$$

where $\gamma$ (the Greek letter gamma) represents the marginal propensity to consume, and $\alpha$
stands for the accelerator (short for acceleration coefficient). Note that, if induced investment
is expunged from the model, we are left with a first-order difference equation which
embodies the dynamic multiplier process (cf. Example 2 of Sec. 17.2). With induced
investment included, however, we have a second-order difference equation that depicts the
interaction of the multiplier and the accelerator.

By virtue of the second equation, we can express $I_t$ in terms of income as follows:

$$I_t = \alpha(\gamma Y_{t-1} - \gamma Y_{t-2}) = \alpha\gamma(Y_{t-1} - Y_{t-2})$$

Upon substituting this and the $C_t$ equation into the first equation in (18.13) and rearranging,
the model can be condensed into the single equation

$$Y_t - \gamma(1 + \alpha)Y_{t-1} + \alpha\gamma Y_{t-2} = G_0$$

or, equivalently (after shifting the subscripts forward by two periods),

$$Y_{t+2} - \gamma(1 + \alpha)Y_{t+1} + \alpha\gamma Y_t = G_0 \tag{18.14}$$

Because this is a second-order linear difference equation with constant coefficients and
constant term, it can be solved by the method just learned.
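
The condensation of (18.13) into (18.14) can be checked by simulating the model both ways. The sketch below iterates the three structural equations period by period, and separately iterates the single equation (18.14); the two income paths should coincide up to rounding. All parameter values and initial incomes are hypothetical, chosen only for illustration.

```python
def simulate_structural(alpha, gamma, g0, y0, y1, periods):
    """Iterate (18.13) directly: C_t and I_t from lagged income, then Y_t."""
    income = [y0, y1]
    for t in range(2, periods + 1):
        c_t = gamma * income[t - 1]          # consumption in period t
        c_prev = gamma * income[t - 2]       # consumption in period t - 1
        i_t = alpha * (c_t - c_prev)         # induced investment
        income.append(c_t + i_t + g0)
    return income


def simulate_reduced(alpha, gamma, g0, y0, y1, periods):
    """Iterate the single equation (18.14), solved for Y_{t+2}."""
    income = [y0, y1]
    for t in range(periods - 1):
        income.append(gamma * (1 + alpha) * income[t + 1]
                      - alpha * gamma * income[t] + g0)
    return income


# Hypothetical parameter values and initial incomes, for illustration only.
structural = simulate_structural(0.8, 0.7, 30.0, 90.0, 95.0, 60)
reduced = simulate_reduced(0.8, 0.7, 30.0, 90.0, 95.0, 60)
```

With these particular values the path also settles near $G_0/(1 - \gamma) = 100$, though that outcome depends on the parameters chosen and is not a general property of the model.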

* Paul A. Samuelson, "Interactions between the Multiplier Analysis and the Principle of Acceleration,"
Review of Economic Statistics, May 1939, pp. 75-78; reprinted in American Economic Association,
Readings in Business Cycle Theory, Richard D. Irwin, Inc., Homewood, Ill., 1944, pp. 261-269.

The Solution

As the particular solution, we have, by (18.2),

$$Y_p = \frac{G_0}{1 - \gamma(1 + \alpha) + \alpha\gamma} = \frac{G_0}{1 - \gamma}$$

It may be noted that the expression $1/(1 - \gamma)$ is merely the multiplier that would prevail in
the absence of induced investment. Thus $G_0/(1 - \gamma)$, the exogenous expenditure item
times the multiplier, should give us the equilibrium income $Y^*$ in the sense that this income
level satisfies the equilibrium condition "national income = total expenditure" [cf.
(3.24)]. Being the particular solution of the model, however, it also gives us the intertemporal
equilibrium income $\bar{Y}$.

With regard to the complementary function, there are three possible cases. Case 1
($a_1^2 > 4a_2$), in the present context, is characterized by

$$\gamma^2(1 + \alpha)^2 > 4\alpha\gamma \qquad \text{or} \qquad \gamma(1 + \alpha)^2 > 4\alpha$$

or

$$\gamma > \frac{4\alpha}{(1 + \alpha)^2}$$

Similarly, to characterize Cases 2 and 3, we only need to change the > sign in the last
inequality to = and <, respectively. In Fig. 18.2, we have drawn the graph of the equation
$\gamma = 4\alpha/(1 + \alpha)^2$. According to the preceding discussion, the $(\alpha, \gamma)$ pairs that are located
exactly on this curve pertain to Case 2. On the other hand, the $(\alpha, \gamma)$ pairs lying above this
curve (involving higher $\gamma$ values) have to do with Case 1, and those lying below the curve
with Case 3.

This tripartite classification, with its graphical representation in Fig. 18.2, is of interest
because it reveals clearly the conditions under which cyclical fluctuations can emerge
endogenously from the interaction of the multiplier and the accelerator. But this tells nothing
about the convergence or divergence of the time path of $Y$. It remains, therefore, for us
to distinguish, under each case, between the damped and the explosive subcases. We could,
of course, take the easy way out by simply illustrating such subcases by citing specific
numerical examples. But let us attempt the more rewarding, if also more arduous, task of
delineating the general conditions under which convergence and divergence will prevail.

FIGURE 18.2
Legend:
    1C    Stable; no cycles
    2C    Stable; no cycles
    3C    Damped stepped fluctuation
    1D    Unstable; no cycles
    2D    Unstable; no cycles
    3D    Explosive stepped fluctuation
    3D (points on the hyperbola $\alpha\gamma = 1$)    Uniform stepped fluctuation

Convergence versus Divergence 

The difference equation (18.14) has the characteristic equation

$$b^2 - \gamma(1 + \alpha)b + \alpha\gamma = 0$$

which yields the two roots

$$b_1, b_2 = \frac{\gamma(1 + \alpha) \pm \sqrt{\gamma^2(1 + \alpha)^2 - 4\alpha\gamma}}{2}$$

Since the question of convergence versus divergence depends on the values of $b_1$ and $b_2$,
and since $b_1$ and $b_2$, in turn, depend on the values of the parameters $\alpha$ and $\gamma$, the conditions
for convergence and divergence should be expressible in terms of the values of $\alpha$ and $\gamma$.
To do this, we can make use of the fact that, by (16.6), the two characteristic roots are
always related to each other by the following two equations:

$$b_1 + b_2 = \gamma(1 + \alpha) \tag{18.15}$$

$$b_1 b_2 = \alpha\gamma \tag{18.15'}$$

On the basis of these two equations, we may observe that

$$\begin{aligned}
(1 - b_1)(1 - b_2) &= 1 - (b_1 + b_2) + b_1 b_2 \\
                   &= 1 - \gamma(1 + \alpha) + \alpha\gamma = 1 - \gamma
\end{aligned} \tag{18.16}$$

In view of the model specification that $0 < \gamma < 1$, it becomes necessary to impose on the
two roots the condition

$$0 < (1 - b_1)(1 - b_2) < 1 \tag{18.17}$$

Let us now examine the question of convergence under Case 1, where the roots are real
and distinct. Since, by assumption, $\alpha$ and $\gamma$ are both positive, (18.15') tells us that
$b_1 b_2 > 0$, which implies that $b_1$ and $b_2$ possess the same algebraic sign. Furthermore, since
$\gamma(1 + \alpha) > 0$, (18.15) indicates that both $b_1$ and $b_2$ must be positive. Hence, the time path
$Y_t$ cannot have oscillations in Case 1.
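
Relations (18.15) through (18.17) are easy to confirm for a particular parameter pair. The sketch below computes the real roots of the characteristic equation and leaves the sum, product, and $(1 - b_1)(1 - b_2)$ to be compared with their predicted values. The pair $\alpha = 0.1$, $\gamma = 0.9$ is a hypothetical choice that happens to fall in Case 1.

```python
import math


def samuelson_real_roots(alpha, gamma):
    """Return (b1, b2) with b1 >= b2 when the characteristic roots are real."""
    root_sum = gamma * (1 + alpha)       # b1 + b2, by (18.15)
    root_product = alpha * gamma         # b1 * b2, by (18.15')
    disc = root_sum ** 2 - 4 * root_product
    if disc < 0:
        raise ValueError("complex roots: gamma < 4*alpha/(1 + alpha)**2")
    half_width = math.sqrt(disc) / 2
    return root_sum / 2 + half_width, root_sum / 2 - half_width


# Hypothetical parameter pair lying in Case 1.
alpha, gamma = 0.1, 0.9
b1, b2 = samuelson_real_roots(alpha, gamma)
check_18_16 = (1 - b1) * (1 - b2)        # should equal 1 - gamma
```

Both roots come out as positive fractions here, and `check_18_16` agrees with $1 - \gamma$ up to rounding, in line with (18.16).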

Even though the signs of $b_1$ and $b_2$ are now known, there actually exist under Case 1 as
many as five possible combinations of $(b_1, b_2)$ values, each with its own implication
regarding the corresponding values for $\alpha$ and $\gamma$:

    (i)     $0 < b_2 < b_1 < 1$    $\Rightarrow$    $0 < \gamma < 1$; $\alpha\gamma < 1$
    (ii)    $0 < b_2 < b_1 = 1$    $\Rightarrow$    $\gamma = 1$
    (iii)   $0 < b_2 < 1 < b_1$    $\Rightarrow$    $\gamma > 1$
    (iv)    $1 = b_2 < b_1$        $\Rightarrow$    $\gamma = 1$
    (v)     $1 < b_2 < b_1$        $\Rightarrow$    $0 < \gamma < 1$; $\alpha\gamma > 1$

Possibility i, where both $b_1$ and $b_2$ are positive fractions, duly satisfies condition (18.17)
and hence conforms to the model specification $0 < \gamma < 1$. The product of the two roots
must also be a positive fraction under this possibility, and this, by (18.15'), implies that
$\alpha\gamma < 1$. In contrast, the next three possibilities all violate condition (18.17) and result in
inadmissible $\gamma$ values (see Exercise 18.2-3). Hence they must be ruled out. But Possibility
v may still be acceptable. With both $b_1$ and $b_2$ greater than one, (18.17) may still be satisfied
if $(1 - b_1)(1 - b_2) < 1$. But this time we have $\alpha\gamma > 1$ (rather than < 1) from
(18.15'). The upshot is that there are only two admissible subcases under Case 1. The
first, Possibility i, involves fractional roots $b_1$ and $b_2$, and therefore yields a convergent
time path of $Y$. The other subcase, Possibility v, features roots greater than one, and thus
produces a divergent time path. As far as the values of $\alpha$ and $\gamma$ are concerned, however, the
question of convergence and divergence only hinges on whether $\alpha\gamma < 1$ or $\alpha\gamma > 1$. This
information is summarized in the top part of Table 18.1, where the convergent subcase is
labeled 1C, and the divergent subcase 1D.

The analysis of Case 2, with repeated roots, is similar in nature. The roots are now
$b = \gamma(1 + \alpha)/2$, with a positive sign because $\alpha$ and $\gamma$ are positive. Thus, there is again no
oscillation. This time we may classify the value of $b$ into three possibilities only:

    (vi)    $0 < b < 1$    $\Rightarrow$    $\gamma < 1$; $\alpha\gamma < 1$
    (vii)   $b = 1$        $\Rightarrow$    $\gamma = 1$
    (viii)  $b > 1$        $\Rightarrow$    $\gamma < 1$; $\alpha\gamma > 1$

Under Possibility vi, $b\ (= b_1 = b_2)$ is a positive fraction; thus the implications regarding $\alpha$
and $\gamma$ are entirely identical with those of Possibility i under Case 1. In an analogous manner,
Possibility viii, with $b\ (= b_1 = b_2)$ greater than one, can satisfy (18.17) only if $1 < b < 2$;
if so, it yields the same results as Possibility v. On the other hand, Possibility vii violates
(18.17) and must be ruled out. Thus there are again only two admissible subcases. The
first, Possibility vi, yields a convergent time path, whereas the other, Possibility viii,
gives a divergent one. In terms of $\alpha$ and $\gamma$, the convergent and divergent subcases are again
associated, respectively, with $\alpha\gamma < 1$ and $\alpha\gamma > 1$. These results are listed in the middle part
of Table 18.1, where the two subcases are labeled 2C (convergent) and 2D (divergent).


TABLE 18.1  Cases and Subcases of the Samuelson Model

Case                                   Subcase                      Values of $\alpha$ and $\gamma$    Time Path $Y_t$

1. Distinct real roots                 1C: $0 < b_2 < b_1 < 1$      $\alpha\gamma < 1$                 Nonoscillatory and
   $\gamma > 4\alpha/(1 + \alpha)^2$   1D: $1 < b_2 < b_1$          $\alpha\gamma > 1$                 nonfluctuating

2. Repeated real roots                 2C: $0 < b < 1$              $\alpha\gamma < 1$                 Nonoscillatory and
   $\gamma = 4\alpha/(1 + \alpha)^2$   2D: $b > 1$                  $\alpha\gamma > 1$                 nonfluctuating

3. Complex roots                       3C: $R < 1$                  $\alpha\gamma < 1$                 With stepped
   $\gamma < 4\alpha/(1 + \alpha)^2$   3D: $R \geq 1$               $\alpha\gamma \geq 1$              fluctuation



Finally, in Case 3, with complex roots, we have stepped fluctuation, and hence endogenous
business cycles. In this case, we should look to the absolute value $R = \sqrt{a_2}$
[see (18.8)] for the clue to convergence and divergence, where $a_2$ is the coefficient of the $y_t$
term in the difference equation (18.1). In the present model, we have $R = \sqrt{\alpha\gamma}$, which
gives rise to the following three possibilities:

    (ix)    $R < 1$    $\Rightarrow$    $\alpha\gamma < 1$
    (x)     $R = 1$    $\Rightarrow$    $\alpha\gamma = 1$
    (xi)    $R > 1$    $\Rightarrow$    $\alpha\gamma > 1$

Even though all of these happen to be admissible (see Exercise 18.2-4), only the $R < 1$
possibility entails a convergent time path and qualifies as Subcase 3C in Table 18.1. The
other two are thus collectively labeled as Subcase 3D.

In sum, we may conclude from Table 18.1 that a convergent time path can occur if and
only if $\alpha\gamma < 1$.

A Graphical Summary

The preceding analysis has resulted in a somewhat complex classification of cases and
subcases. It would help to have a visual representation of the classificatory scheme. This is
supplied in Fig. 18.2.

The set of all admissible $(\alpha, \gamma)$ pairs in the model is shown in Fig. 18.2 by the variously
shaded rectangular area. Since the values of $\gamma = 0$ and $\gamma = 1$ are excluded, as is the value
$\alpha = 0$, the shaded area is a sort of rectangle without sides. We have already graphed the
equation $\gamma = 4\alpha/(1 + \alpha)^2$ to mark off the three major cases of Table 18.1: The points on
that curve pertain to Case 2; the points lying to the north of the curve (representing higher $\gamma$
values) belong to Case 1; those lying to the south (with lower $\gamma$ values) are of Case 3. To
distinguish between the convergent and divergent subcases, we now add the graph of $\alpha\gamma = 1$
(a rectangular hyperbola) as another demarcation line. The points lying to the north of this
rectangular hyperbola satisfy the inequality $\alpha\gamma > 1$, whereas those located below it
correspond to $\alpha\gamma < 1$. It is then possible to mark off the subcases easily. Under Case 1, the
broken-line shaded region, being below the hyperbola, corresponds to Subcase 1C, but the solid-line
shaded region is associated with Subcase 1D. Under Case 2, which relates to the points lying
on the curve $\gamma = 4\alpha/(1 + \alpha)^2$, Subcase 2C covers the upward-sloping portion of that curve,
and Subcase 2D, the downward-sloping portion. Finally, for Case 3, the rectangular hyperbola
serves to separate the dot-shaded region (Subcase 3C) from the pebble-shaded region
(Subcase 3D). The latter, you should note, also includes the points located on the rectangular
hyperbola itself, because of the weak inequality in the specification $\alpha\gamma \geq 1$.

Since Fig. 18.2 is the repository of all the qualitative conclusions of the model, given
any ordered pair $(\alpha, \gamma)$, we can always find the correct subcase graphically by plotting the
ordered pair in the diagram.

Example 1

If the accelerator is 0.8 and the marginal propensity to consume is 0.7, what kind of
interaction time path will result? The ordered pair (0.8, 0.7) is located in the dot-shaded region,
Subcase 3C; thus the time path is characterized by damped stepped fluctuation.

Example 2

What kind of interaction is implied by $\alpha = 2$ and $\gamma = 0.5$? The ordered pair (2, 0.5) lies
exactly on the rectangular hyperbola, under Subcase 3D. The time path of $Y$ will again display
stepped fluctuation, but it will be neither explosive nor damped. By analogy to the cases of
uniform oscillation and uniform fluctuation, we may term this situation as "uniform stepped
fluctuation." However, the uniformity feature in this latter case cannot in general be expected
to be a perfect one, because, similarly to what was done in Fig. 18.1, we can only accept
those points on a sine or cosine curve that correspond to integer values of $t$, but these values
of $t$ may hit an entirely different set of points on the curve in each period of fluctuation.
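
The graphical lookup in Fig. 18.2 can be mimicked by a small function that applies the two demarcation lines of Table 18.1 in turn. This is only a sketch: the function name is hypothetical, and the exact-equality test for Case 2 is fragile in floating-point arithmetic, so it should be read as an illustration of the classification rather than a robust routine.

```python
def samuelson_subcase(alpha, gamma):
    """Return the Table 18.1 label ('1C', ..., '3D') for an admissible (alpha, gamma)."""
    if not (alpha > 0 and 0 < gamma < 1):
        raise ValueError("need alpha > 0 and 0 < gamma < 1")
    boundary = 4 * alpha / (1 + alpha) ** 2   # the curve separating Cases 1 and 3
    if gamma > boundary:
        case = "1"                            # distinct real roots
    elif gamma == boundary:
        case = "2"                            # repeated real roots (exact equality only)
    else:
        case = "3"                            # complex roots
    # Convergence requires alpha*gamma < 1 in every case.
    return case + ("C" if alpha * gamma < 1 else "D")


example1 = samuelson_subcase(0.8, 0.7)   # Example 1
example2 = samuelson_subcase(2, 0.5)     # Example 2
```

The two calls reproduce the verdicts of Examples 1 and 2, namely Subcase 3C and Subcase 3D, respectively.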


EXERCISE 18.2 

1. By consulting Fig. 18.2, find the subcases to which the following sets of values of $\alpha$ and
$\gamma$ pertain, and describe the interaction time path qualitatively.

(a) $\alpha = 3.5$; $\gamma = 0.8$        (c) $\alpha = 0.2$; $\gamma = 0.9$

(b) $\alpha = 2$; $\gamma = 0.7$        (d) $\alpha = 1.5$; $\gamma = 0.6$

2. From the values of $\alpha$ and $\gamma$ given in parts (a) and (c) of Prob. 1, find the numerical
values of the characteristic roots in each instance, and analyze the nature of the time path.
Do your results check with those obtained earlier?

3. Verify that Possibilities ii, iii, and iv in Case 1 imply inadmissible values of $\gamma$.

4. Show that in Case 3 we can never encounter $\gamma \geq 1$.

18.3 Inflation and Unemployment in Discrete Time

The interaction of inflation and unemployment, discussed earlier in the continuous-time 
framework, can also be couched in discrete time. Using essentially the same economic 
assumptions, we shall illustrate in this section how that model can be reformulated as a 
difference-equation model. 

The Model 

The earlier continuous-time formulation (Sec. 16.5) consisted of three differential
equations:

$$p = \alpha - T - \beta U + g\pi \qquad \text{[expectations-augmented Phillips relation]} \tag{16.33}$$

$$\frac{d\pi}{dt} = j(p - \pi) \qquad \text{[adaptive expectations]} \tag{16.34}$$

$$\frac{dU}{dt} = -k(m - p) \qquad \text{[monetary policy]} \tag{16.35}$$

Three endogenous variables are present: $p$ (actual rate of inflation), $\pi$ (expected rate of
inflation), and $U$ (rate of unemployment). As many as six parameters appear in the model;
among these, the parameter $m$, the rate of growth of nominal money (or, the rate of monetary
expansion), differs from the others in that its magnitude is set as a policy decision.
When cast into the period-analysis mold, the Phillips relation (16.33) simply becomes

$$p_t = \alpha - T - \beta U_t + g\pi_t \qquad (\alpha, \beta > 0;\ 0 < g \leq 1) \tag{18.18}$$

In the adaptive-expectations equation, the derivative must be replaced by a difference
expression:

$$\pi_{t+1} - \pi_t = j(p_t - \pi_t) \qquad (0 < j \leq 1) \tag{18.19}$$

By the same token, the monetary-policy equation should be changed to†

$$U_{t+1} - U_t = -k(m - p_{t+1}) \qquad (k > 0) \tag{18.20}$$

These three equations constitute the new version of the inflation-unemployment model.

The Difference Equation in p 

As the first step in the analysis of this new model, we again try to condense the model into
a single equation in a single variable. Let that variable be $p$. Accordingly, we shall focus
our attention on (18.18). However, since (18.18), unlike the other two equations, does
not by itself describe a pattern of change, it is up to us to create such a pattern. This is
accomplished by differencing $p_t$, i.e., by taking the first difference of $p_t$, according to the
definition

$$\Delta p_t \equiv p_{t+1} - p_t$$

Two steps are involved in this. First, we shift the time subscripts in (18.18) forward one
period, to get

$$p_{t+1} = \alpha - T - \beta U_{t+1} + g\pi_{t+1} \tag{18.18'}$$

Then we subtract (18.18) from (18.18'), to obtain the first difference of $p_t$ that gives the
desired pattern of change:

$$\begin{aligned}
p_{t+1} - p_t &= -\beta(U_{t+1} - U_t) + g(\pi_{t+1} - \pi_t) \\
              &= \beta k(m - p_{t+1}) + gj(p_t - \pi_t) \qquad \text{[by (18.20) and (18.19)]}
\end{aligned} \tag{18.21}$$

Note that, on the second line of (18.21), the patterns of change of the other two variables as
given in (18.19) and (18.20) have been incorporated into the pattern of change of the $p$
variable. Thus (18.21) now embodies all the information in the present model.

However, the $\pi_t$ term is extraneous to the study of $p$ and needs to be eliminated from
(18.21). To that end, we make use of the fact that

$$g\pi_t = p_t - (\alpha - T) + \beta U_t \qquad \text{[by (18.18)]} \tag{18.22}$$

Substituting this into (18.21) and collecting terms, we obtain

$$(1 + \beta k)p_{t+1} - [1 - j(1 - g)]p_t + j\beta U_t = \beta km + j(\alpha - T) \tag{18.23}$$

But there now appears a $U_t$ term to be eliminated. To do that, we difference (18.23) to get
a $(U_{t+1} - U_t)$ term and then use (18.20) to eliminate the latter. Only after this rather
lengthy process of substitutions do we get the desired difference equation in the $p$ variable
alone, which, when duly normalized, takes the form

$$p_{t+2} - \frac{1 + gj + (1 - j)(1 + \beta k)}{1 + \beta k}\, p_{t+1} + \frac{1 - j(1 - g)}{1 + \beta k}\, p_t = \frac{j\beta km}{1 + \beta k} \tag{18.24}$$

Here the coefficients of $p_{t+1}$ and $p_t$ play the roles of $a_1$ and $a_2$, respectively, and the
right-hand side that of the constant term $c$.

† We have assumed that the change in $U_t$ depends on $(m - p_{t+1})$, the rate of growth of real money
in period $(t + 1)$. As an alternative, it is possible to make it depend on the rate of growth of real
money in period $t$, $(m - p_t)$ (see Exercise 18.3-4).

The Time Path of p

The intertemporal equilibrium value of $p$, given by the particular integral of (18.24), is

$$\bar{p} = \frac{c}{1 + a_1 + a_2} = \frac{j\beta km}{j\beta k} = m \qquad \text{[by (18.2)]}$$

As in the continuous-time model, therefore, the equilibrium rate of inflation is exactly
equal to the rate of monetary expansion.
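
Since the derivation of (18.24) is lengthy, a numerical cross-check may be reassuring. The sketch below iterates the three model equations (18.18) to (18.20) directly and then measures how far the simulated inflation path departs from (18.24). Because (18.20) involves $p_{t+1}$, which itself depends on $U_{t+1}$, the code solves the two relations jointly for $U_{t+1}$. All parameter values and initial conditions are hypothetical, chosen so that $\beta k = 5$, $g = \frac{1}{2}$, and $j = \frac{1}{3}$.

```python
def simulate_model(alpha, T, beta, g, j, k, m, u0, pi0, periods):
    """Iterate (18.18)-(18.20) and return the inflation path [p_0, ..., p_periods]."""
    u, pi = u0, pi0
    p = alpha - T - beta * u + g * pi                     # (18.18) at t = 0
    path = [p]
    for _ in range(periods):
        pi_next = pi + j * (p - pi)                       # (18.19)
        # Substituting (18.18') into (18.20) and solving for U_{t+1}:
        u_next = (u + k * (alpha - T - m) + k * g * pi_next) / (1 + beta * k)
        p_next = alpha - T - beta * u_next + g * pi_next  # (18.18')
        u, pi, p = u_next, pi_next, p_next
        path.append(p)
    return path


def residual_18_24(path, beta, g, j, k, m, t):
    """Left side minus right side of (18.24) at period t."""
    d = 1 + beta * k
    a1 = -(1 + g * j + (1 - j) * d) / d
    a2 = (1 - j * (1 - g)) / d
    return path[t + 2] + a1 * path[t + 1] + a2 * path[t] - j * beta * k * m / d


# Hypothetical parameter values and initial conditions, for illustration only.
path = simulate_model(alpha=0.09, T=0.02, beta=0.5, g=0.5, j=1 / 3, k=10.0,
                      m=0.04, u0=0.05, pi0=0.03, periods=60)
residuals = [residual_18_24(path, 0.5, 0.5, 1 / 3, 10.0, 0.04, t) for t in range(59)]
```

The residuals should vanish up to rounding error, and with these values the last entry of `path` lies very close to $m = 0.04$, as the equilibrium result suggests.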

As to the complementary function, there may arise either distinct real roots (Case 1), or
repeated real roots (Case 2), or complex roots (Case 3), depending on the relative magnitudes
of $a_1^2$ and $4a_2$. In the present model,

$$a_1^2 \gtreqless 4a_2 \qquad \text{iff} \qquad [1 + gj + (1 - j)(1 + \beta k)]^2 \gtreqless 4[1 - j(1 - g)](1 + \beta k) \tag{18.25}$$

If $g = \frac{1}{2}$, $j = \frac{1}{3}$, and $\beta k = 5$, for instance, then the left-side expression in (18.25) is
$\left(5\frac{1}{6}\right)^2$ (roughly 26.7) whereas the right-side expression is 20; thus Case 1
results. But if $g = j = 1$, then the left-side expression is 4 while the right-side expression is
$4(1 + \beta k) > 4$, and we have Case 3
instead. In view of the larger number of parameters in the present model, however, it is not
feasible to construct a classificatory graph like Fig. 18.2 in the Samuelson model.

Nevertheless, the analysis of convergence can still proceed along the same line as in
Sec. 18.2. Specifically, we recall from (16.6) that the two characteristic roots $b_1$ and $b_2$ must
satisfy the following two relations [see (18.24)]:

$$b_1 + b_2 = -a_1 = \frac{1 + gj}{1 + \beta k} + 1 - j > 0 \tag{18.26}$$

$$b_1 b_2 = a_2 = \frac{1 - j(1 - g)}{1 + \beta k} \in (0, 1) \tag{18.26'}$$

Furthermore, we have in the present model

$$(1 - b_1)(1 - b_2) = 1 - (b_1 + b_2) + b_1 b_2 = \frac{\beta jk}{1 + \beta k} > 0 \tag{18.27}$$

Now consider Case 1, where the two roots $b_1$ and $b_2$ are real and distinct. Since their
product $b_1 b_2$ is positive, $b_1$ and $b_2$ must take the same sign. Because their sum is positive,
moreover, $b_1$ and $b_2$ must both be positive, implying that no oscillation can occur.
From (18.27), we can infer that neither $b_1$ nor $b_2$ can be equal to one; for otherwise
$(1 - b_1)(1 - b_2)$ would be zero, in violation of the indicated inequality. This means that, in
terms of the various possibilities of $(b_1, b_2)$ combinations enumerated in the Samuelson
model, Possibilities ii and iv cannot arise here. It is also unacceptable to have one root
greater, and the other root less, than one; for otherwise $(1 - b_1)(1 - b_2)$ would be negative.
Thus Possibility iii is ruled out as well. It follows that $b_1$ and $b_2$ must be either both greater
than one, or both less than one. If $b_1 > 1$ and $b_2 > 1$ (Possibility v), however, (18.26')
would be violated. Hence the only viable eventuality is Possibility i, with $b_1$ and $b_2$ both
being positive fractions, so that the time path of $p$ is convergent.

The analysis of Case 2 is basically not much different. By practically identical reasoning,
we can conclude that the repeated root $b$ can only turn out to be a positive fraction in
this model; that is, Possibility vi is feasible, but not Possibilities vii and viii. The time path
of $p$ in Case 2 is again nonoscillatory and convergent.

For Case 3, convergence requires that $R$ (the absolute value of the complex roots) be less
than one. By (18.8), $R = \sqrt{a_2}$. Inasmuch as $a_2$ is a positive fraction [see (18.26')], we do
have $R < 1$. Thus the time path of $p$ in Case 3 is also convergent, although this time there
will be stepped fluctuation.
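
The two numerical illustrations given for (18.25) can be pushed one step further by computing the characteristic roots themselves. The sketch below does so from the coefficients of (18.24). Only the product $\beta k$ matters for the roots, so it enters as a single argument; the value $\beta k = 5$ used in the second call is an assumption, since the text leaves it unspecified for the $g = j = 1$ illustration.

```python
import cmath


def inflation_roots(g, j, beta_k):
    """Characteristic roots of (18.24); beta_k stands for the product beta*k."""
    d = 1 + beta_k
    a1 = -(1 + g * j + (1 - j) * d) / d
    a2 = (1 - j * (1 - g)) / d
    disc = cmath.sqrt(a1 ** 2 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2


case1_roots = inflation_roots(1 / 2, 1 / 3, 5)   # real-root illustration
case3_roots = inflation_roots(1, 1, 5)           # g = j = 1; beta*k = 5 is assumed
```

In the first call both roots should be real positive fractions (Possibility i); in the second they should be conjugate complex with absolute value $\sqrt{a_2} < 1$, so that both paths converge.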

The Analysis of U 

If we wish to analyze instead the time path of the rate of unemployment, we may take
(18.20) as the point of departure. To get rid of the $p$ term in that equation, we first substitute
(18.18') to get

$$(1 + \beta k)U_{t+1} - U_t = k(\alpha - T - m) + kg\pi_{t+1} \tag{18.28}$$

Next, to prepare for the substitution of the other equation, (18.19), we difference (18.28) to
find that

$$(1 + \beta k)U_{t+2} - (2 + \beta k)U_{t+1} + U_t = kg(\pi_{t+2} - \pi_{t+1}) \tag{18.29}$$

In view of the presence of a difference expression in $\pi$ on the right, we can substitute for it
a forward-shifted version of the adaptive-expectations equation. The result of this,

$$(1 + \beta k)U_{t+2} - (2 + \beta k)U_{t+1} + U_t = kgj(p_{t+1} - \pi_{t+1}) \tag{18.30}$$

is the embodiment of all the information in the model.

However, we must eliminate the $p$ and $\pi$ variables before a proper difference equation
in $U$ will emerge. For this purpose, we note from (18.20) that

$$kp_{t+1} = U_{t+1} - U_t + km \tag{18.31}$$

Moreover, by multiplying (18.22) through by $(-kj)$ and shifting the time subscripts, we
can write

$$\begin{aligned}
-kjg\pi_{t+1} &= -kjp_{t+1} + kj(\alpha - T) - \beta kjU_{t+1} \\
&= -j(U_{t+1} - U_t + km) + kj(\alpha - T) - \beta kjU_{t+1} \qquad [\text{by (18.31)}] \\
&= -j(1 + \beta k)U_{t+1} + jU_t + kj(\alpha - T - m)
\end{aligned} \tag{18.32}$$

These two results express $p_{t+1}$ and $\pi_{t+1}$ in terms of the $U$ variable and can thus enable us,
on substitution into (18.30), to obtain, at long last, the desired difference equation in the
$U$ variable alone:


$$U_{t+2} - \frac{1 + gj + (1 - j)(1 + \beta k)}{1 + \beta k}\,U_{t+1} + \frac{1 - j(1 - g)}{1 + \beta k}\,U_t = \frac{kj[\alpha - T - (1 - g)m]}{1 + \beta k} \tag{18.33}$$

It is noteworthy that the two constant coefficients on the left ($a_1$ and $a_2$) are identical
with those in the difference equation for $p$ [i.e., (18.24)]. As a result, the earlier analysis of
the complementary function of the $p$ path should be equally applicable to the present
context. But the constant term on the right of (18.33) does differ from that of (18.24).
Consequently, the particular solutions in the two situations will be different. This is as it should
be, for, coincidence aside, there is no inherent reason to expect the intertemporal
equilibrium rate of unemployment to be the same as the equilibrium rate of inflation.

The Long-Run Phillips Relation

It is readily verified that the intertemporal equilibrium rate of unemployment is

$$\bar{U} = \frac{1}{\beta}[\alpha - T - (1 - g)m]$$

But since the equilibrium rate of inflation has been found to be $\bar{p} = m$, we can link $\bar{U}$ to $\bar{p}$
by the equation

$$\bar{U} = \frac{1}{\beta}[\alpha - T - (1 - g)\bar{p}] \tag{18.34}$$

Because this equation is concerned only with the equilibrium rates of unemployment and
inflation, it is said to depict the long-run Phillips relation.
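
The convergence of $U$ and $p$ to these equilibrium values can be checked numerically. What follows is only a minimal sketch under stated assumptions, not part of the model's formal analysis: it iterates (18.18), (18.19), and (18.28) forward and compares the end of the path with $\bar{U}$ from (18.34) and with $\bar{p} = m$. The function name `simulate`, the initial values, and every parameter value are illustrative assumptions.

```python
def simulate(alpha, T, beta, g, j, k, m, U0, pi0, periods):
    """Iterate the inflation-unemployment model forward.

    Returns the final (U, p) pair after `periods` steps.
    """
    U, pi = U0, pi0
    for _ in range(periods):
        p = alpha - T - beta * U + g * pi              # (18.18)
        pi_next = pi + j * (p - pi)                    # (18.19)
        # (18.28): (1 + beta*k) U_{t+1} - U_t = k(alpha - T - m) + k g pi_{t+1}
        U_next = (U + k * (alpha - T - m) + k * g * pi_next) / (1 + beta * k)
        U, pi = U_next, pi_next
    p = alpha - T - beta * U + g * pi                  # inflation at the final date
    return U, p


# Hypothetical parameter values, chosen only for illustration.
params = dict(alpha=0.10, T=0.02, beta=0.5, g=0.5, j=0.5, k=0.5, m=0.04)
U_end, p_end = simulate(U0=0.20, pi0=0.10, periods=500, **params)

# Long-run Phillips relation (18.34) evaluated at p-bar = m.
U_bar = (params["alpha"] - params["T"] - (1 - params["g"]) * params["m"]) / params["beta"]
print(U_end, U_bar)        # both approximately 0.12
print(p_end, params["m"])  # both approximately 0.04
```

With these particular values the characteristic roots are complex, so the simulated path fluctuates on its way to equilibrium; other admissible parameter choices may give a nonoscillatory approach instead.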

A special case of (18.34) has received a great deal of attention among economists: the
case of $g = 1$. If $g = 1$, the $\bar{p}$ term will have a zero coefficient and thus drop out of the
picture. In other words, $\bar{U}$ will become a constant function of $\bar{p}$. In the standard Phillips
diagram, where the rate of unemployment is plotted on the horizontal axis, this outcome gives
rise to a vertical long-run Phillips curve. The $\bar{U}$ value in this case, referred to as the natural
rate of unemployment, is then consistent with any equilibrium rate of inflation, with the
notable policy implication that, in the long run, there is no trade-off between the twin evils of
inflation and unemployment as exists in the short run.

But what if $g < 1$? In that event, the coefficient of $\bar{p}$ in (18.34) will be negative. Then the
long-run Phillips curve will turn out to be downward-sloping, thereby still providing a
trade-off relation between inflation and unemployment. Whether the long-run Phillips curve is
vertical or negatively sloped is, therefore, critically dependent on the value of the $g$
parameter, which, according to the expectations-augmented Phillips relation, measures the extent to
which the expected rate of inflation can work its way into the wage structure and the actual
rate of inflation. All of this may sound familiar to you. This is because we discussed the topic
in Example 1 in Sec. 16.5, and you have also worked on it in Exercise 16.5-4.


EXERCISE 18.3

1. Supply the intermediate steps leading from (18.23) to (18.24).

2. Show that if the model discussed in this section is condensed into a difference equation
in the variable $\pi$, the result will be the same as (18.24) except for the substitution of $\pi$
for $p$.

3. The time paths of $p$ and $U$ in the model discussed in this section have been found to be
consistently convergent. Can divergent time paths arise if we drop the assumption that
$g < 1$? If yes, which divergent "possibilities" in Cases 1, 2, and 3 will now become
feasible?

4. Retain equations (18.18) and (18.19), but change (18.20) to

$$U_{t+1} - U_t = -k(m - p_t)$$

(a) Derive a new difference equation in the variable $p$.

(b) Does the new difference equation yield a different $\bar{p}$?

(c) Assume that $j = g = 1$. Find the conditions under which the characteristic roots
will fall under Cases 1, 2, and 3, respectively.

(d) Let $j = g = 1$. Describe the time path of $p$ (including convergence or divergence)
when $\beta k = 3$, 4, and 5, respectively.

18.4 Generalizations to Variable-Term and Higher-Order Equations

We are now ready to extend our methods in two directions, to the variable-term case and to
difference equations of higher orders.

Variable Term in the Form of $cm^t$

When the constant term $c$ in (18.1) is replaced by a variable term (some function of $t$), the
only effect will be on the particular solution. (Why?) To find the new particular solution, we
can again apply the method of undetermined coefficients. In the differential-equation
context (Sec. 16.6), that method requires that the variable term and its successive derivatives
together take only a finite number of distinct types of expression, apart from multiplicative
constants. Applied to difference equations, the requirement should be amended to read: "the
variable term and its successive differences must together take only a finite number of
distinct expression types, apart from multiplicative constants." Let us illustrate this method by
concrete examples, first taking a variable term in the form $cm^t$, where $c$ and $m$ are constants.


Example 1 


Find the particular solution of

$$y_{t+2} + y_{t+1} - 3y_t = 7^t$$

Here, we have $c = 1$ and $m = 7$. First, let us ascertain whether the variable term $7^t$ yields a
finite number of expression types on successive differencing. According to the rule of
differencing ($\Delta y_t = y_{t+1} - y_t$), the first difference of the term is

$$\Delta 7^t = 7^{t+1} - 7^t = (7 - 1)7^t = 6(7)^t$$

Similarly, the second difference, $\Delta^2(7^t)$, can be expressed as

$$\Delta(\Delta 7^t) = \Delta 6(7)^t = 6(7)^{t+1} - 6(7)^t = 6(7 - 1)7^t = 36(7)^t$$

Moreover, as can be verified, all successive differences will, like the first and second, be
some multiple of $7^t$. Since there is only a single expression type, we can try a solution
$y_t = B(7)^t$ for the particular solution, where $B$ is an undetermined coefficient.

Substituting the trial solution and its corresponding versions for periods $(t + 1)$ and
$(t + 2)$ into the given difference equation, we obtain

$$B(7)^{t+2} + B(7)^{t+1} - 3B(7)^t = 7^t \qquad \text{or} \qquad B(7^2 + 7 - 3)(7)^t = 7^t$$

Thus,

$$B = \frac{1}{49 + 7 - 3} = \frac{1}{53}$$

and we can write the particular solution as

$$y_p = B(7)^t = \frac{1}{53}(7)^t$$

This may be taken as a moving equilibrium. You can verify the correctness of the solution
by substituting it into the difference equation and seeing to it that there will result an
identity, $7^t = 7^t$.


The result reached in Example 1 can be easily generalized from the variable term $7^t$ to
that of $cm^t$. From our experience, we expect all the successive differences of $cm^t$ to take the
same form of expression: namely, $Bm^t$, where $B$ is some multiplicative constant. Hence we
can try a solution $y_t = Bm^t$ for the particular solution, when given the difference equation

$$y_{t+2} + a_1 y_{t+1} + a_2 y_t = cm^t \tag{18.35}$$

Using the trial solution $y_t = Bm^t$, which implies $y_{t+1} = Bm^{t+1}$, etc., we can rewrite
equation (18.35) as

$$Bm^{t+2} + a_1 Bm^{t+1} + a_2 Bm^t = cm^t \qquad \text{or} \qquad B(m^2 + a_1 m + a_2)m^t = cm^t$$

Hence the coefficient $B$ in the trial solution should be

$$B = \frac{c}{m^2 + a_1 m + a_2}$$

and the desired particular solution of (18.35) can be written as

$$y_p = Bm^t = \frac{c}{m^2 + a_1 m + a_2}\,m^t \qquad (m^2 + a_1 m + a_2 \neq 0) \tag{18.36}$$

Note that the denominator of $B$ is not allowed to be zero. If it happens to be,* we must
then use the trial solution $y_t = Btm^t$ instead; or, if that too fails, $y_t = Bt^2 m^t$.
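
As an illustration of (18.36), the coefficient $B$ can be computed with exact rational arithmetic. This is a minimal sketch rather than a general solver: the function name `particular_coefficient` is a hypothetical helper, and the zero-denominator case is only detected, not resolved.

```python
from fractions import Fraction


def particular_coefficient(a1, a2, c, m):
    """Return B in the trial solution y_t = B * m**t for
    y_{t+2} + a1*y_{t+1} + a2*y_t = c * m**t."""
    denominator = Fraction(m) ** 2 + a1 * Fraction(m) + a2
    if denominator == 0:
        # m is a characteristic root; the trial form B*m**t cannot work.
        raise ValueError("m is a characteristic root; try B*t*m**t instead")
    return Fraction(c) / denominator


# Example 1: y_{t+2} + y_{t+1} - 3*y_t = 7**t, so a1 = 1, a2 = -3, c = 1, m = 7.
B = particular_coefficient(a1=1, a2=-3, c=1, m=7)
print(B)  # 1/53


def y_p(t):
    """Particular solution of Example 1."""
    return B * 7 ** t


# Substituting back should reproduce the variable term exactly.
for t in range(6):
    assert y_p(t + 2) + y_p(t + 1) - 3 * y_p(t) == 7 ** t
```

Using `Fraction` instead of floating-point numbers keeps the identity check exact, which matters because $7^t$ grows quickly.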

Variable Term in the Form of $ct^n$

Let us now consider variable terms in the form $ct^n$, where $c$ is any constant, and $n$ is a
positive integer.


Example 2 


Find the particular solution of

$$y_{t+2} + 5y_{t+1} + 2y_t = t^2$$

The first three differences of $t^2$ (a special case of $ct^n$ with $c = 1$ and $n = 2$) are found as
follows:†

$$\begin{aligned}
\Delta t^2 &= (t + 1)^2 - t^2 = 2t + 1 \\
\Delta^2 t^2 &= \Delta(\Delta t^2) = \Delta(2t + 1) = \Delta 2t + \Delta 1 \\
&= 2(t + 1) - 2t + 0 = 2 \qquad [\Delta\,\text{constant} = 0] \\
\Delta^3 t^2 &= \Delta(\Delta^2 t^2) = \Delta 2 = 0
\end{aligned}$$

Since further differencing will only yield zero, there are altogether three distinct types of
expression: $t^2$ (from the variable term itself), $t$, and a constant (from the successive
differences).

Let us therefore try the solution

$$y_t = B_0 + B_1 t + B_2 t^2$$

for the particular solution, with undetermined coefficients $B_0$, $B_1$, and $B_2$. Note that this
solution implies

$$\begin{aligned}
y_{t+1} &= B_0 + B_1(t + 1) + B_2(t + 1)^2 = (B_0 + B_1 + B_2) + (B_1 + 2B_2)t + B_2 t^2 \\
y_{t+2} &= B_0 + B_1(t + 2) + B_2(t + 2)^2 = (B_0 + 2B_1 + 4B_2) + (B_1 + 4B_2)t + B_2 t^2
\end{aligned}$$

When these are substituted into the difference equation, we obtain

$$(8B_0 + 7B_1 + 9B_2) + (8B_1 + 14B_2)t + 8B_2 t^2 = t^2$$

Equating the two sides term by term, we see that the undetermined coefficients are
required to satisfy the following simultaneous equations:

$$\begin{aligned}
8B_0 + 7B_1 + 9B_2 &= 0 \\
8B_1 + 14B_2 &= 0 \\
8B_2 &= 1
\end{aligned}$$

Thus, their values must be $B_0 = \frac{13}{256}$, $B_1 = -\frac{7}{32}$, and $B_2 = \frac{1}{8}$, giving us the particular
solution

$$y_p = \frac{13}{256} - \frac{7}{32}t + \frac{1}{8}t^2$$

* Analogous to the situation in Example 3 of Sec. 16.6, this eventuality will materialize when the
constant $m$ happens to be equal to a characteristic root of the difference equation. The characteristic
roots of the difference equation of (18.35) are the values of $b$ that satisfy the equation
$b^2 + a_1 b + a_2 = 0$. If one root happens to have the value $m$, then it must follow that
$m^2 + a_1 m + a_2 = 0$.

† These results should be compared with the first three derivatives of $t^2$:
$\frac{d}{dt}t^2 = 2t$, $\frac{d^2}{dt^2}t^2 = 2$, and $\frac{d^3}{dt^3}t^2 = 0$.

Our experience with the variable term $t^2$ should enable us to generalize the method to
the case of $ct^n$. In the new trial solution, there should obviously be a term $B_n t^n$, to
correspond to the given variable term. Furthermore, since successive differencing of the term
yields the distinct expressions $t^{n-1}, t^{n-2}, \ldots, t$, and $B_0$ (constant), the new trial solution for
the case of the variable term $ct^n$ should be written as

$$y_t = B_0 + B_1 t + B_2 t^2 + \cdots + B_n t^n$$

But the rest of the procedure is entirely the same.

It must be added that such a trial solution may also fail to work. In that event, the trick,
already employed on countless other occasions, is again to multiply the original trial
solution by a sufficiently high power of $t$. That is, we can instead try
$y_t = t(B_0 + B_1 t + B_2 t^2 + \cdots + B_n t^n)$, etc.
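
The coefficient matching in Example 2 can be retraced mechanically. The sketch below is only an illustration with hypothetical variable names: it solves the three simultaneous equations by back-substitution, using exact fractions, and then substitutes the resulting polynomial into the difference equation.

```python
from fractions import Fraction

# Back-substitution on the three equations from Example 2:
#   8*B0 + 7*B1 + 9*B2 = 0
#   8*B1 + 14*B2       = 0
#   8*B2               = 1
B2 = Fraction(1, 8)
B1 = -14 * B2 / 8
B0 = -(7 * B1 + 9 * B2) / 8
print(B0, B1, B2)  # 13/256 -7/32 1/8


def y_p(t):
    """Trial solution B0 + B1*t + B2*t**2 with the solved coefficients."""
    return B0 + B1 * t + B2 * t ** 2


# The particular solution must satisfy y_{t+2} + 5*y_{t+1} + 2*y_t = t**2.
for t in range(10):
    assert y_p(t + 2) + 5 * y_p(t + 1) + 2 * y_p(t) == t ** 2
```

The same pattern extends to a variable term $ct^n$: the system stays triangular, so the highest-order coefficient is found first and the rest follow one at a time.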

Higher-Order Linear Difference Equations 

The order of a difference equation indicates the highest-order difference present in the
equation; but it also indicates the maximum number of periods of time lag involved. An
nth-order linear difference equation (with constant coefficients and constant term) may
thus be written in general as

$$y_{t+n} + a_1 y_{t+n-1} + \cdots + a_{n-1} y_{t+1} + a_n y_t = c \tag{18.37}$$

The method of finding the particular solution of this does not differ in any substantive
way. As a starter, we can still try $y_t = k$ (the case of stationary intertemporal equilibrium).
Should this fail, we then try $y_t = kt$ or $y_t = kt^2$, etc., in that order.

In the search for the complementary function, however, we shall now be confronted with
a characteristic equation which is an nth-degree polynomial equation:

$$b^n + a_1 b^{n-1} + \cdots + a_{n-1} b + a_n = 0 \tag{18.38}$$

There will now be $n$ characteristic roots $b_i$ $(i = 1, 2, \ldots, n)$, all of which should enter into
the complementary function thus:

$$y_c = \sum_{i=1}^{n} A_i b_i^t \tag{18.39}$$

provided, of course, that the roots are all real and distinct. In case there are repeated real roots
(say, $b_1 = b_2 = b_3$), then the first three terms in the sum in (18.39) must be modified to

$$A_1 b_1^t + A_2 t b_1^t + A_3 t^2 b_1^t \qquad [\text{cf. (18.6)}]$$

Moreover, if there is a pair of conjugate complex roots (say, $b_{n-1}$ and $b_n$), then the last two
terms in the sum in (18.39) are to be combined into the expression

$$R^t(A_{n-1} \cos \theta t + A_n \sin \theta t)$$

A similar expression can also be assigned to any other pair of complex roots. In case of two
repeated pairs, however, one of the two must be given a multiplicative factor of $tR^t$ instead
of $R^t$.

After $y_p$ and $y_c$ are both found, the general solution of the complete difference equation
(18.37) is again obtained by summing; that is,

$$y_t = y_p + y_c$$

But since there will be a total of $n$ arbitrary constants in this solution, no fewer than $n$ initial
conditions will be required to definitize it.


Example 3 


Find the general solution of the third-order difference equation

$$y_{t+3} - \frac{7}{8}y_{t+2} + \frac{1}{8}y_{t+1} + \frac{1}{32}y_t = 9$$

By trying the solution $y_t = k$, the particular solution is easily found to be $y_p = 32$. As for the
complementary function, since the cubic characteristic equation

$$b^3 - \frac{7}{8}b^2 + \frac{1}{8}b + \frac{1}{32} = 0$$

can be factored into the form

$$\left(b - \frac{1}{2}\right)\left(b - \frac{1}{2}\right)\left(b + \frac{1}{8}\right) = 0$$

the roots are $b_1 = b_2 = \frac{1}{2}$ and $b_3 = -\frac{1}{8}$. This enables us to write

$$y_c = A_1\left(\frac{1}{2}\right)^t + A_2 t\left(\frac{1}{2}\right)^t + A_3\left(-\frac{1}{8}\right)^t$$

Note that the second term contains a multiplicative $t$; this is due to the presence of repeated
roots. The general solution of the given difference equation is then simply the sum of $y_c$ and $y_p$.

In this example, all three characteristic roots happen to be less than 1 in their absolute
values. We can therefore conclude that the solution obtained represents a time path which
converges to the stationary equilibrium level 32.
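
Both claims in Example 3 (the factoring of the cubic and the convergence to 32) lend themselves to a quick check. The following is a sketch under stated assumptions: the equation is taken to be $y_{t+3} - \frac{7}{8}y_{t+2} + \frac{1}{8}y_{t+1} + \frac{1}{32}y_t = 9$, the helper names are hypothetical, and the initial values of the simulated path are arbitrary.

```python
from fractions import Fraction

# Coefficients of b**3 - (7/8)*b**2 + (1/8)*b + 1/32 = 0 from Example 3.
coeffs = [Fraction(1), Fraction(-7, 8), Fraction(1, 8), Fraction(1, 32)]


def poly(b):
    """Evaluate the characteristic polynomial at b (Horner's rule)."""
    value = Fraction(0)
    for c in coeffs:
        value = value * b + c
    return value


def poly_derivative(b):
    """Evaluate the derivative 3*b**2 - (7/4)*b + 1/8 at b."""
    return 3 * b ** 2 - Fraction(7, 4) * b + Fraction(1, 8)


half, minus_eighth = Fraction(1, 2), Fraction(-1, 8)
assert poly(half) == 0 and poly(minus_eighth) == 0
assert poly_derivative(half) == 0      # a zero derivative marks a repeated root

# Iterate y_{t+3} = 9 + (7/8)*y_{t+2} - (1/8)*y_{t+1} - (1/32)*y_t
# from arbitrary (hypothetical) initial values.
y = [0.0, 10.0, 50.0]
for t in range(200):
    y.append(9 + 0.875 * y[t + 2] - 0.125 * y[t + 1] - 0.03125 * y[t])
print(y[-1])  # approximately 32
```

The root check is exact, whereas the simulated path uses floating-point numbers and therefore only approaches 32 up to rounding error.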


Convergence and the Schur Theorem 

When we have a high-order difference equation that is not easily solved, we can
nonetheless determine the convergence of the relevant time path qualitatively without having to
struggle with its actual quantitative solution. You will recall that the time path can converge
if and only if every root of the characteristic equation is less than 1 in absolute value.
In view of this, the following theorem, known as the Schur theorem,† becomes directly
applicable:

The roots of the nth-degree polynomial equation

$$a_0 b^n + a_1 b^{n-1} + \cdots + a_{n-1} b + a_n = 0$$

will all be less than unity in absolute value if and only if the following $n$ determinants

$$\Delta_1 = \left|\begin{array}{c|c} a_0 & a_n \\ \hline a_n & a_0 \end{array}\right| \qquad
\Delta_2 = \left|\begin{array}{cc|cc} a_0 & 0 & a_n & a_{n-1} \\ a_1 & a_0 & 0 & a_n \\ \hline a_n & 0 & a_0 & a_1 \\ a_{n-1} & a_n & 0 & a_0 \end{array}\right| \qquad \cdots$$

$$\Delta_n = \left|\begin{array}{cccc|cccc}
a_0 & 0 & \cdots & 0 & a_n & a_{n-1} & \cdots & a_1 \\
a_1 & a_0 & \cdots & 0 & 0 & a_n & \cdots & a_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n-1} & a_{n-2} & \cdots & a_0 & 0 & 0 & \cdots & a_n \\ \hline
a_n & 0 & \cdots & 0 & a_0 & a_1 & \cdots & a_{n-1} \\
a_{n-1} & a_n & \cdots & 0 & 0 & a_0 & \cdots & a_{n-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
a_1 & a_2 & \cdots & a_n & 0 & 0 & \cdots & a_0
\end{array}\right|$$

are all positive.

Note that, since the condition in the theorem is given on the "if and only if" basis, it is
a necessary-and-sufficient condition. Thus the Schur theorem is a perfect
difference-equation counterpart of the Routh theorem introduced earlier in the differential-equation
framework.

The construction of these determinants is based on a simple procedure. This is best
explained with the aid of the lines which partition each determinant into four areas.
Each area of the kth determinant, $\Delta_k$, always consists of a $k \times k$ subdeterminant. The
upper-left area has $a_0$ alone in the diagonal, zeros above the diagonal, and progressively
larger subscripts for the successive coefficients in each column below the diagonal
elements. When we transpose the elements of the upper-left area, we obtain the lower-right
area. Turning to the upper-right area, we now place the $a_n$ coefficient alone in the diagonal,
with zeros below the diagonal, and progressively smaller subscripts for the successive
coefficients as we go up each column from the diagonal. When the elements of this area are
transposed, we get the lower-left area.

The application of this theorem is straightforward. Since the coefficients of the
characteristic equation are the same as those appearing on the left side of the original difference
equation, we can introduce them directly into the determinants cited. Note that, in our
context, we always have $a_0 = 1$.
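
The construction rule just described translates directly into code. The sketch below is illustrative only: `det`, `schur_determinant`, and `schur_converges` are hypothetical helper names, the determinant is evaluated exactly with fractions, and the second test equation (with roots $\frac{1}{3}$ and $-\frac{1}{2}$) is an assumed example chosen because its roots are known.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination on Fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result          # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


def schur_determinant(coeffs, k):
    """Build and evaluate the k-th Schur determinant for coeffs = [a0, ..., an]."""
    n = len(coeffs) - 1
    # Upper-left block: a0 on the diagonal, larger subscripts down each column.
    upper_left = [[coeffs[i - j] if i >= j else 0 for j in range(k)] for i in range(k)]
    # Upper-right block: an on the diagonal, smaller subscripts up each column.
    upper_right = [[coeffs[n - (j - i)] if j >= i else 0 for j in range(k)] for i in range(k)]
    top = [upper_left[i] + upper_right[i] for i in range(k)]
    # Lower blocks are the transposes of the upper-right and upper-left blocks.
    bottom = [
        [upper_right[j][i] for j in range(k)] + [upper_left[j][i] for j in range(k)]
        for i in range(k)
    ]
    return det(top + bottom)


def schur_converges(coeffs):
    """True if all n Schur determinants are positive."""
    n = len(coeffs) - 1
    return all(schur_determinant(coeffs, k) > 0 for k in range(1, n + 1))


# b**2 + 3b + 2 = 0 has roots -1 and -2, so the first determinant already fails.
print(schur_determinant([1, 3, 2], 1))   # -3
print(schur_converges([1, 3, 2]))        # False

# b**2 + (1/6)b - 1/6 = 0 has roots 1/3 and -1/2 (an assumed illustration).
stable = [1, Fraction(1, 6), Fraction(-1, 6)]
print(schur_determinant(stable, 1), schur_determinant(stable, 2))  # 35/36 49/54
print(schur_converges(stable))           # True
```

Because `all` stops at the first nonpositive determinant, the test mirrors the practice of not computing higher-order determinants once the condition is violated.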


Example 4 


Does the time path of the equation $y_{t+2} + 3y_{t+1} + 2y_t = 12$ converge? Here we have $n = 2$,
and the coefficients are $a_0 = 1$, $a_1 = 3$, and $a_2 = 2$. Thus we get

$$\Delta_1 = \begin{vmatrix} a_0 & a_2 \\ a_2 & a_0 \end{vmatrix} = \begin{vmatrix} 1 & 2 \\ 2 & 1 \end{vmatrix} = -3 < 0$$

Since this already violates the convergence condition, there is no need to proceed to $\Delta_2$.

Actually, the characteristic roots of the given difference equation are easily found to be
$b_1, b_2 = -1, -2$, which indeed imply a divergent time path.

† For a discussion of this theorem and its history, see John S. Chipman, The Theory of Inter-Sectoral
Money Flows and Income Formation, The Johns Hopkins Press, Baltimore, 1951, pp. 119-120.


Example 5 


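Test the convergence of the path of $y_{t+2} + \frac{1}{6}y_{t+1} - \frac{1}{6}y_t = 2$ by the Schur theorem. Here the
coefficients are $a_0 = 1$, $a_1 = \frac{1}{6}$, $a_2 = -\frac{1}{6}$ (with $n = 2$). Thus we have

$$\Delta_1 = \begin{vmatrix} a_0 & a_2 \\ a_2 & a_0 \end{vmatrix} = \begin{vmatrix} 1 & -\frac{1}{6} \\ -\frac{1}{6} & 1 \end{vmatrix} = \frac{35}{36} > 0$$

$$\Delta_2 = \begin{vmatrix} a_0 & 0 & a_2 & a_1 \\ a_1 & a_0 & 0 & a_2 \\ a_2 & 0 & a_0 & a_1 \\ a_1 & a_2 & 0 & a_0 \end{vmatrix} = \begin{vmatrix} 1 & 0 & -\frac{1}{6} & \frac{1}{6} \\ \frac{1}{6} & 1 & 0 & -\frac{1}{6} \\ -\frac{1}{6} & 0 & 1 & \frac{1}{6} \\ \frac{1}{6} & -\frac{1}{6} & 0 & 1 \end{vmatrix} = \frac{1176}{1296} > 0$$

These do satisfy the necessary-and-sufficient condition for convergence.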


EXERCISE 18.4 

1. Apply the definition of the "differencing" symbol $\Delta$, to find:

(a) $\Delta t$ (b) $\Delta^2 t$ (c) $\Delta t^3$

Compare the results of differencing with those of differentiation.

2. Find the particular solution of each of the following:

(a) $y_{t+2} + 2y_{t+1} + y_t = 3^t$

(b) $y_{t+2} - 5y_{t+1} - 6y_t = 2(6)^t$

(c) $3y_{t+2} + 9y_t = 3(4)^t$

3. Find the particular solutions of:

(a) $y_{t+2} - 2y_{t+1} + 5y_t = t$

(b) $y_{t+2} - 2y_{t+1} + 5y_t = 4 + 2t$

(c) $y_{t+2} + 5y_{t+1} + 2y_t = 18 + 6t + 8t^2$

4. Would you expect that, when the variable term takes the form $m^t + t^n$, the trial
solution should be $B(m)^t + (B_0 + B_1 t + \cdots + B_n t^n)$? Why?

5. Find the characteristic roots and the complementary function of:

(a) $y_{t+3} - \frac{1}{2}y_{t+2} - y_{t+1} + \frac{1}{2}y_t = 0$

(b) $y_{t+3} - 2y_{t+2} + \frac{5}{4}y_{t+1} - \frac{1}{4}y_t = 1$

[Hint: Try factoring out $(b - \frac{1}{2})$ in both characteristic equations.]

6. Test the convergence of the solutions of the following difference equations by the
Schur theorem:

(a) Yi +2 + 2 Ki+i — 5 /( — 3 

(b) Y;+2 p yt — 1 

7. In the case of a third-order difference equation

$$y_{t+3} + a_1 y_{t+2} + a_2 y_{t+1} + a_3 y_t = c$$

what are the exact forms of the determinants required by the Schur theorem?

Simultaneous Differential Equations and Difference Equations


Heretofore, our discussion of economic dynamics has been confined to the analysis of a
single dynamic (differential or difference) equation. In the present chapter, methods for
analyzing a system of simultaneous dynamic equations are introduced. Because this would
entail the handling of several variables at the same time, you might anticipate a great deal
of new complications. But the truth is that much of what we have already learned about
single dynamic equations can be readily extended to systems of simultaneous dynamic
equations. For instance, the solution of a dynamic system would still consist of a set of
particular integrals or particular solutions (intertemporal equilibrium values of the various
variables) and complementary functions (deviations from equilibriums). The
complementary functions would still be based on the reduced equations, i.e., the homogeneous versions
of the equations in the system. And the dynamic stability of the system would still depend
on the signs (if differential equation system) or the absolute values (if difference equation
system) of the characteristic roots in the complementary functions. Thus the problem of a
dynamic system is only slightly more complicated than that of a single dynamic equation.

19.1 The Genesis of Dynamic Systems

There are two general ways in which a dynamic system can come into being. It may em¬ 
anate from a given set of interacting patterns of change. Or it may be derived from a single 
given pattern of change, provided the latter consists of a dynamic equation of the second 
(or higher) order. 

Interacting Patterns of Change 

The most obvious case of a given set of interacting patterns of change is that of a
multisector model where each sector, as described by a dynamic equation, impinges on at least one
of the other sectors. A dynamic version of the input-output model, for example, could
involve $n$ industries whose output changes produce dynamic repercussions on the other
industries. Thus it constitutes a dynamic system. Similarly, a dynamic general-equilibrium
market model would involve $n$ commodities that are interrelated in their price adjustments.
Thus, there is again a dynamic system.

However, interacting patterns of change can be found even in a single-sector model. The
various variables in such a model represent, not different sectors or different commodities,
but different aspects of an economy. Nonetheless, they can affect one another in their
dynamic behavior, so as to provide a network of interactions.† A concrete example of this has
in fact been encountered in Chap. 18. In the inflation-unemployment model, the expected
rate of inflation $\pi$ follows a pattern of change, (18.19), that depends not only on $\pi$, but also
on the rate of unemployment $U$ (through the actual rate of inflation $p$). Reciprocally, the
pattern of change of $U$, (18.20), is dependent on $\pi$ (again through $p$). Thus the dynamics
of $\pi$ and $U$ must be simultaneously determined. In retrospect, therefore, the
inflation-unemployment model could have been treated as a simultaneous-equation dynamic model.
And that would have obviated the long sequence of substitutions and eliminations that were
undertaken to condense the model into a single equation in one variable. Below, in Sec. 19.4,
we shall indeed rework that model, viewed as a dynamic system. Meanwhile, the notion that
the same model can be analyzed either as a single equation or as an equation system supplies
a natural cue to the discussion of the second way to have a dynamic system.

The Transformation of a High-Order Dynamic Equation 

Suppose that we are given an nth-order differential (or difference) equation in one variable.
Then, as will be shown, it is always possible to transform that equation into a
mathematically equivalent system of $n$ simultaneous first-order differential (or difference) equations
in $n$ variables. In particular, a second-order differential equation can be rewritten as two
simultaneous first-order differential equations in two variables.‡ Thus, even if we happen to
start out with only one (high-order) dynamic equation, a dynamic system can nevertheless
be derived through the artifice of mathematical transformation. This fact, incidentally, has
an important implication: In the ensuing discussion of dynamic systems, we need only be
concerned with systems of first-order equations, for if a higher-order equation is present,
we can always transform it first into a set of first-order equations. This will result in a larger
number of equations in the system, but the order will then be lowered to the minimum.

To illustrate the transformation procedure, let us consider the single difference equation

$$y_{t+2} + a_1 y_{t+1} + a_2 y_t = c \tag{19.1}$$

If we concoct an artificial new variable $x_t$, defined by

$$x_t \equiv y_{t+1} \qquad (\text{implying } x_{t+1} = y_{t+2})$$

we can then express the original second-order equation by means of two first-order
(one-period lag) simultaneous equations as follows:

$$\begin{aligned}
x_{t+1} + a_1 x_t + a_2 y_t &= c \\
y_{t+1} - x_t &= 0
\end{aligned} \tag{19.1'}$$

It is easily seen that, as long as the second equation (which defines the variable $x_t$) is
satisfied, the first is identical with the original given equation. By a similar procedure, and using
more artificial variables, we can similarly transform a higher-order single equation into an
equivalent system of simultaneous first-order equations. You can verify, for instance, that
the third-order equation

$$y_{t+3} + y_{t+2} - 3y_{t+1} + 2y_t = 0 \tag{19.2}$$

can be expressed as

$$\begin{aligned}
w_{t+1} + w_t - 3x_t + 2y_t &= 0 \\
x_{t+1} - w_t &= 0 \\
y_{t+1} - x_t &= 0
\end{aligned} \tag{19.2'}$$

where $x_t \equiv y_{t+1}$ (so that $x_{t+1} = y_{t+2}$) and $w_t \equiv x_{t+1}$ (so that $w_{t+1} = x_{t+2} = y_{t+3}$).

† Note that if we have two dynamic equations in the two variables $y_1$ and $y_2$ such that the pattern
of change of $y_1$ depends exclusively on $y_1$ itself, and similarly for $y_2$, we really do not have a
simultaneous-equation system. Instead, we have merely two separate dynamic equations, each of
which can be analyzed by itself, with no requirement of "simultaneity."

‡ Conversely, two first-order differential (or difference) equations in two variables can be consolidated
into a single second-order equation in one variable, as we did in Secs. 16.5 and 18.3.

By a perfectly similar procedure, we can also transform an nth-order differential
equation into a system of $n$ first-order equations. Given the second-order differential equation

$$y''(t) + a_1 y'(t) + a_2 y(t) = 0 \tag{19.3}$$

for instance, we can introduce a new variable $x(t)$, defined by

$$x(t) \equiv y'(t) \qquad [\text{implying } x'(t) = y''(t)]$$

Then (19.3) can be rewritten as the following system of two first-order equations:

$$\begin{aligned}
x'(t) + a_1 x(t) + a_2 y(t) &= 0 \\
y'(t) - x(t) &= 0
\end{aligned} \tag{19.3'}$$

where, you may note, the second equation performs the function of defining the newly
introduced $x$ variable, as did the second equation in (19.1'). Essentially the same procedure
can also be used to transform a higher-order differential equation. The only modification is
that we must introduce a correspondingly larger number of new variables.
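
The equivalence between a high-order equation and its first-order system can be confirmed by iterating both. The sketch below does so for the third-order equation (19.2) and the system (19.2'); the function names and the initial values are hypothetical, and integer arithmetic keeps the comparison exact.

```python
def iterate_single(y0, y1, y2, periods):
    """Iterate (19.2): y_{t+3} = -y_{t+2} + 3*y_{t+1} - 2*y_t."""
    y = [y0, y1, y2]
    for t in range(periods):
        y.append(-y[t + 2] + 3 * y[t + 1] - 2 * y[t])
    return y


def iterate_system(y0, x0, w0, periods):
    """Iterate (19.2'): three first-order equations in w, x, and y."""
    y, x, w = y0, x0, w0
    path = [y]
    for _ in range(periods):
        w_next = -w + 3 * x - 2 * y   # w_{t+1} + w_t - 3*x_t + 2*y_t = 0
        x_next = w                    # x_{t+1} - w_t = 0
        y_next = x                    # y_{t+1} - x_t = 0
        y, x, w = y_next, x_next, w_next
        path.append(y)
    return path


# Hypothetical initial values; x_0 = y_1 and w_0 = y_2 by definition.
single = iterate_single(1, 2, 0, periods=10)
system = iterate_system(y0=1, x0=2, w0=0, periods=10)
assert system == single[:11]   # the y paths coincide period by period
```

Note that the system needs only the current values of $w$, $x$, and $y$ at each step, which is exactly what "first-order" means here; the three initial conditions of the single equation reappear as the initial state of the system.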


19.2 Solving Simultaneous Dynamic Equations

The methods for solving simultaneous differential equations and simultaneous difference
equations are quite similar. We shall thus discuss them together in this section. For our
present purposes, we shall confine the discussion to linear equations with constant coefficients
only.


Simultaneous Difference Equations 

Suppose that we are given the following system of linear difference equations:

$$\begin{aligned}
x_{t+1} + 6x_t + 9y_t &= 4 \\
y_{t+1} - x_t &= 0
\end{aligned} \tag{19.4}$$

How do we find the time paths of $x$ and $y$ such that both equations in this system will be
satisfied? Essentially, our task is again to seek the particular integrals and complementary
functions, and sum these to obtain the desired time paths of the two variables.

Since particular integrals represent intertemporal equilibrium values, let us denote them
by $\bar{x}$ and $\bar{y}$. As before, it is advisable first to try constant solutions, namely, $x_{t+1} = x_t = \bar{x}$
and $y_{t+1} = y_t = \bar{y}$. This will indeed work in the present case, for upon substituting these
trial solutions into (19.4) we get

$$\left.\begin{aligned}
7\bar{x} + 9\bar{y} &= 4 \\
-\bar{x} + \bar{y} &= 0
\end{aligned}\right\} \;\Rightarrow\; \bar{x} = \bar{y} = \frac{1}{4} \tag{19.5}$$

(In case such constant solutions fail to work, however, we must then try solutions of the
form $x_t = k_1 t$, $y_t = k_2 t$, etc.)

For the complementary functions, we should, drawing on our previous experience, adopt
trial solutions of the form

$$x_t = mb^t \qquad \text{and} \qquad y_t = nb^t \tag{19.6}$$

where $m$ and $n$ are arbitrary constants and the base $b$ represents the characteristic root. It is
then automatically implied that

$$x_{t+1} = mb^{t+1} \qquad \text{and} \qquad y_{t+1} = nb^{t+1} \tag{19.7}$$

Note that, to simplify matters, we are employing the same base $b \neq 0$ for both variables,
although their coefficients are allowed to differ. It is our aim to find the values of $b$, $m$, and
$n$ that can make the trial solutions (19.6) satisfy the reduced (homogeneous) version
of (19.4).

Upon substituting the trial solutions into the reduced version of (19.4) and canceling the
common factor $b^t \neq 0$, we obtain the two equations

$$\begin{aligned}
(b + 6)m + 9n &= 0 \\
-m + bn &= 0
\end{aligned} \tag{19.8}$$

This can be considered as a linear homogeneous-equation system in the two variables $m$
and $n$, if we are willing to consider $b$ as a parameter for the time being. Because the
system (19.8) is homogeneous, it can yield only the trivial solution $m = n = 0$ if its coefficient
matrix is nonsingular (see Table 5.1 in Sec. 5.5). In that event, the complementary functions
in (19.6) will both be identically zero, signifying that $x$ and $y$ never deviate from their
intertemporal equilibrium values. Since that would be an uninteresting special case, we shall
try to rule out that trivial solution by requiring the coefficient matrix of the system to be
singular. That is, we shall require the determinant of that matrix to vanish:

$$\begin{vmatrix} b + 6 & 9 \\ -1 & b \end{vmatrix} = b^2 + 6b + 9 = 0 \tag{19.9}$$

From this quadratic equation, we find that $b\,(= b_1 = b_2) = -3$ is the only value which can
prevent $m$ and $n$ from both being zero in (19.8). We shall therefore only use this value of $b$.
Equation (19.9) is called the characteristic equation, and its roots the characteristic roots,
of the given simultaneous difference-equation system.

Once we have a specific value of $b$, (19.8) gives us the corresponding solution values of
$m$ and $n$. The system being homogeneous, however, there will actually emerge an infinite
number of solutions for $(m, n)$, expressible in the form of an equation $m = kn$, where $k$ is a
constant. In fact, for each root $b_i$, there will in general be a distinct equation $m_i = k_i n_i$.
Even with repeated roots, with $b_1 = b_2$, we should still use two such equations, $m_1 = k_1 n_1$
and $m_2 = k_2 n_2$ in the complementary functions. Moreover, with repeated roots, we recall
from (18.6) that the complementary functions should be written as

$$\begin{aligned}
x_t &= m_1(-3)^t + m_2 t(-3)^t \\
y_t &= n_1(-3)^t + n_2 t(-3)^t
\end{aligned}$$

The factors of proportionality between $m_i$ and $n_i$ must, of course, satisfy the given
equation system (19.4), which mandates that $y_{t+1} = x_t$, i.e.,

$$n_1(-3)^{t+1} + n_2(t + 1)(-3)^{t+1} = m_1(-3)^t + m_2 t(-3)^t$$

Dividing through by $(-3)^t$, we get

$$-3n_1 - 3n_2(t + 1) = m_1 + m_2 t$$

or, after rearranging,

$$-3(n_1 + n_2) - 3n_2 t = m_1 + m_2 t$$

Equating the terms with $t$ on the two sides of the equals sign, and similarly for the terms
without $t$, we find

$$m_1 = -3(n_1 + n_2) \qquad \text{and} \qquad m_2 = -3n_2$$

If we now write $n_1 = A_3$ and $n_2 = A_4$, then it follows that

$$m_1 = -3(A_3 + A_4) \qquad m_2 = -3A_4$$

Thus the complementary functions can be written as

$$\begin{aligned}
x_c &= -3(A_3 + A_4)(-3)^t - 3A_4 t(-3)^t \\
&= -3A_3(-3)^t - 3A_4(t + 1)(-3)^t \\
y_c &= A_3(-3)^t + A_4 t(-3)^t
\end{aligned} \tag{19.10}$$

where $A_3$ and $A_4$ are arbitrary constants. Then the general solution follows easily by
combining the particular solutions in (19.5) with the complementary functions just found. All
that remains, then, is to definitize the two arbitrary constants $A_3$ and $A_4$ with the help of
appropriate initial or boundary conditions.

One significant feature of the preceding solution is that, since both time paths have
identical $b^t$ expressions in them, they must either both converge or both diverge. This makes
sense because, in a model with dynamically interdependent variables, a general
intertemporal equilibrium cannot prevail unless no dynamic motion is present anywhere in the
system. In the present case, with repeated roots $b = -3$, the time paths of both $x$ and $y$ will
display explosive oscillation.
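
The general solution can be compared with a direct iteration of system (19.4). The following is a minimal sketch under stated assumptions: the particular solutions are taken as $\bar{x} = \bar{y} = \frac{1}{4}$ (from $7\bar{x} + 9\bar{y} = 4$ and $-\bar{x} + \bar{y} = 0$), the constants $A_3$ and $A_4$ are definitized from initial values $x_0$ and $y_0$, and the function names and initial values are hypothetical.

```python
from fractions import Fraction


def simulate(x0, y0, periods):
    """Iterate system (19.4): x_{t+1} = 4 - 6*x_t - 9*y_t and y_{t+1} = x_t."""
    path = [(Fraction(x0), Fraction(y0))]
    for _ in range(periods):
        x, y = path[-1]
        path.append((4 - 6 * x - 9 * y, x))
    return path


def closed_form(x0, y0, t):
    """General solution: (19.10) plus the particular solutions x-bar = y-bar = 1/4."""
    bar = Fraction(1, 4)
    A3 = Fraction(y0) - bar                      # from y_0 = A3 + 1/4
    A4 = (bar - Fraction(x0)) / 3 - A3           # from x_0 = -3*A3 - 3*A4 + 1/4
    power = Fraction(-3) ** t
    x_t = -3 * A3 * power - 3 * A4 * (t + 1) * power + bar
    y_t = A3 * power + A4 * t * power + bar
    return x_t, y_t


# Hypothetical initial conditions.
path = simulate(x0=1, y0=2, periods=8)
for t, point in enumerate(path):
    assert closed_form(1, 2, t) == point

# Successive deviations of y from 1/4 alternate in sign and grow in size.
print([float(y - Fraction(1, 4)) for _, y in path[:5]])
```

Starting both variables at $\frac{1}{4}$ makes $A_3 = A_4 = 0$, in which case the path never leaves the equilibrium; any other starting point produces the explosive oscillation described above.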

Matrix Notation 

In order to bring out the basic parallelism between the methods of solving a single equation 
and an equation system, the preceding exposition was carried out without the benefit of ma¬ 
trix notation. Let us now' see how the latter can be utilized here. Even though it may seem 
pointless to apply matrix notation to a simple system of only two equations, the possibility 
of extending that notation to the n-equation case should make it a worthwhile exercise. 
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First of all, the given system (19.4) may be expressed as 


1 - 1 

o — 

— o 

•Vl 

+ 

1 - 1 

O' o 

VO — 

i 

i_1 

x t 

J’<. 

— 

V 

0 


or, more succinctly, as 

lu + Kv = d 


(19.4') 


(19.4") 


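where $I$ is the 2 × 2 identity matrix; $K$ is the 2 × 2 matrix of the coefficients of the $x_t$ and 
$y_t$ terms; and $u$, $v$, and $d$ are column vectors defined as follows:* 

$$u = \begin{bmatrix} x_{t+1} \\ y_{t+1} \end{bmatrix} \qquad v = \begin{bmatrix} x_t \\ y_t \end{bmatrix} \qquad d = \begin{bmatrix} 4 \\ 0 \end{bmatrix}$$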


The reader may find one feature puzzling: Since we know $Iu = u$, why not drop the $I$? The 
answer is that, even though it seems redundant now, the identity matrix will be needed in 
subsequent operations, and therefore we shall retain it as in (19.4''). 

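When we try constant solutions $x_{t+1} = x_t = \bar{x}$ and $y_{t+1} = y_t = \bar{y}$ for the particular 
solutions, we are in effect setting $u = v = \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix}$; this will reduce (19.4'') to 

$$(I + K) \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix} = d$$

If the inverse $(I + K)^{-1}$ exists, we can express the particular solutions as 

$$\begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix} = (I + K)^{-1} d \tag{19.5'}$$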


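This is of course a general formula, for it is valid for any matrix $K$ and vector $d$ as long as 
$(I + K)^{-1}$ exists. Applied to our numerical example, we have 

$$\begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix} = \begin{bmatrix} 7 & 9 \\ -1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 4 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{16} & -\frac{9}{16} \\ \frac{1}{16} & \frac{7}{16} \end{bmatrix} \begin{bmatrix} 4 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} \\ \frac{1}{4} \end{bmatrix}$$

Therefore, $\bar{x} = \bar{y} = \frac{1}{4}$, which checks with (19.5). 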
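
The arithmetic behind (19.5') can be reproduced in a few lines. The sketch below is ours, not part of the original exposition: the helper names `inverse_2x2` and `mat_vec` are illustrative, and exact fractions are used in place of decimals.

```python
from fractions import Fraction as F

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjoint divided by the determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; no particular solution of this form")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Product of a 2 x 2 matrix and a 2-element column vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

I = [[F(1), F(0)], [F(0), F(1)]]
K = [[F(6), F(9)], [F(-1), F(0)]]   # coefficient matrix of the system (19.4)
d = [F(4), F(0)]

I_plus_K = [[I[i][j] + K[i][j] for j in range(2)] for i in range(2)]
particular = mat_vec(inverse_2x2(I_plus_K), d)   # (I + K)^(-1) d
print(particular)
```

The same two helpers would serve for any other 2 × 2 system of this type, provided that $I + K$ is nonsingular.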

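Turning to the complementary functions, we see that the trial solutions (19.6) and (19.7) 
give the $u$ and $v$ vectors the specific forms 

$$u = \begin{bmatrix} mb^{t+1} \\ nb^{t+1} \end{bmatrix} = \begin{bmatrix} m \\ n \end{bmatrix} b^{t+1} \qquad \text{and} \qquad v = \begin{bmatrix} mb^t \\ nb^t \end{bmatrix} = \begin{bmatrix} m \\ n \end{bmatrix} b^t$$

When substituted into the reduced equation $Iu + Kv = 0$, these trial solutions will 
transform the latter into 

$$I \begin{bmatrix} m \\ n \end{bmatrix} b^{t+1} + K \begin{bmatrix} m \\ n \end{bmatrix} b^t = 0$$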


* The symbol $v$ here denotes a vector. Do not confuse it with the $v$ in the complex-number notation 
$h \pm vi$, where it represents a scalar. 
or, after multiplying through by $b^{-t}$ (a scalar) and factoring, 

$$(bI + K) \begin{bmatrix} m \\ n \end{bmatrix} = 0 \tag{19.8'}$$


where 0 is a zero vector. It is from this homogeneous-equation system that we are to find 
the appropriate values of $b$, $m$, and $n$ to be used in the trial solutions in order to make the 
latter determinate. 

To avoid trivial solutions for $m$ and $n$, it is necessary that 

$$|bI + K| = 0 \tag{19.9'}$$

And this is the characteristic equation which will give us the characteristic roots $b_i$. You can 
verify that if we substitute 

$$bI = \begin{bmatrix} b & 0 \\ 0 & b \end{bmatrix} \qquad \text{and} \qquad K = \begin{bmatrix} 6 & 9 \\ -1 & 0 \end{bmatrix}$$

into this equation, the result will precisely be (19.9), yielding the repeated roots $b = -3$. 
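
For a 2 × 2 matrix $K$, expanding the determinant gives $|bI + K| = b^2 + (k_{11} + k_{22})b + (k_{11}k_{22} - k_{12}k_{21})$, so the characteristic equation can be read off from the trace and determinant of $K$. The following sketch (with function names of our own choosing) applies this to the numerical example.

```python
import cmath

def char_poly_bI_plus_K(K):
    """Coefficients (1, c1, c2) of |bI + K| = b^2 + c1*b + c2 for a 2 x 2 matrix K."""
    trace = K[0][0] + K[1][1]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return 1, trace, det

def quadratic_roots(c1, c2):
    """Roots of b^2 + c1*b + c2 = 0; cmath keeps the complex-root case usable."""
    disc = cmath.sqrt(c1 * c1 - 4 * c2)
    return (-c1 + disc) / 2, (-c1 - disc) / 2

K = [[6, 9], [-1, 0]]
_, c1, c2 = char_poly_bI_plus_K(K)
roots = quadratic_roots(c1, c2)
print(c1, c2, roots)   # coefficients of b^2 + 6b + 9 and its repeated root
```

Because both roots exceed 1 in absolute value and are negative, the check agrees with the explosive oscillation noted earlier.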

In general, each root $b_i$ will elicit from (19.8') a particular set of an infinite number of 
solution values of $m$ and $n$, which are tied to each other by the equation $m_i = k_i n_i$. It is 
therefore possible to write, for each value of $b_i$, 

$$n_i = A_i \qquad \text{and} \qquad m_i = k_i A_i$$

where $A_i$ are arbitrary constants to be definitized later. When substituted into the trial 
solutions, these expressions for $n_i$ and $m_i$, along with the values $b_i$, will lead to specific forms 
of complementary functions. If all roots are distinct real numbers, we may apply (18.5) and 
write 

$$\begin{bmatrix} x_c \\ y_c \end{bmatrix} = \begin{bmatrix} \sum m_i b_i^t \\ \sum n_i b_i^t \end{bmatrix} = \begin{bmatrix} \sum k_i A_i b_i^t \\ \sum A_i b_i^t \end{bmatrix}$$


With repeated roots, however, we must apply (18.6) instead and, as a result, the 
complementary functions will contain terms with an extra multiplicative $t$, such as $m_1 b^t + m_2 t b^t$ 
(for $x_c$) and $n_1 b^t + n_2 t b^t$ (for $y_c$). The factors of proportionality between $m_i$ and $n_i$ are to 
be determined by the relationship between the variables x and y as stipulated in the given 
equation system, as illustrated in (19.10) in our numerical example. Finally, in the 
complex-root case, the complementary functions should be written with (18.10) as their 
prototype. 

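Finally, to get the general solution, we can simply form the sum 

$$\begin{bmatrix} x_t \\ y_t \end{bmatrix} = \begin{bmatrix} x_c \\ y_c \end{bmatrix} + \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix}$$

Then it remains only to definitize the arbitrary constants $A_i$. 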

The extension of this procedure to the n-equation system should be self-evident. When 
n is large, however, the characteristic equation (an nth-degree polynomial equation) may 
not be easy to solve quantitatively. In that event, we may again find the Schur theorem to be 
of help in yielding certain qualitative conclusions about the time paths of the variables in 
the system. All these variables, we recall, are assigned the same base $b$ in the trial solutions, 
so they must end up with the same $b_i^t$ expressions in the complementary functions and 
share the same convergence properties. Thus a single application of the Schur theorem will 
enable us to determine the convergence or divergence of the time path of every variable in 
the system. 

Simultaneous Differential Equations 

The method of solution just described can also be applied to a first-order linear 
differential-equation system. About the only major modification needed is to change the trial 
solutions to 

$$x(t) = me^{rt} \qquad \text{and} \qquad y(t) = ne^{rt} \tag{19.11}$$

which imply that 

$$x'(t) = rme^{rt} \qquad \text{and} \qquad y'(t) = rne^{rt} \tag{19.12}$$

In line with our notational convention, the characteristic roots are now denoted by $r$ instead 
of $b$. 

Suppose that we are given the following equation system: 

$$\begin{aligned} x'(t) + 2y'(t) + 2x(t) + 5y(t) &= 77 \\ y'(t) + x(t) + 4y(t) &= 61 \end{aligned} \tag{19.13}$$

First, let us rewrite it in matrix notation as 

$$Ju + Mv = g \tag{19.13'}$$


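where the matrices are 

$$J = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \qquad u = \begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} \qquad M = \begin{bmatrix} 2 & 5 \\ 1 & 4 \end{bmatrix} \qquad v = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix} \qquad g = \begin{bmatrix} 77 \\ 61 \end{bmatrix}$$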


Note that, in view of the appearance of the $2y'(t)$ term in the first equation of (19.13), we 
have to use the matrix $J$ in place of the identity matrix $I$, as in (19.4''). Of course, if $J$ is 
nonsingular (so that $J^{-1}$ exists), then we can in a sense normalize (19.13') by premultiplying 
every term therein by $J^{-1}$, to get 

$$J^{-1}Ju + J^{-1}Mv = J^{-1}g \qquad \text{or} \qquad Iu + Kv = d \tag{19.13''}$$

This new format is an exact duplicate of (19.4''), although it must be remembered that the 
vectors $u$ and $v$ have altogether different meanings in the two different contexts. In the 
ensuing development, we shall adhere to the $Ju + Mv = g$ formulation given in (19.13'). 

To find the particular integrals, let us try constant solutions $x(t) = \bar{x}$ and $y(t) = \bar{y}$, 
which imply that $x'(t) = y'(t) = 0$. If these solutions hold, the vectors $v$ and $u$ will become 
$v = \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix}$ and $u = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, and (19.13') will reduce to $Mv = g$. Thus the solution for $\bar{x}$ 
and $\bar{y}$ can be written as 

$$\begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix} = M^{-1}g \tag{19.14}$$

which you should compare with (19.5'). In numerical terms, our present problem yields the 
following particular integrals: 


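$$\begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix} = \begin{bmatrix} 2 & 5 \\ 1 & 4 \end{bmatrix}^{-1} \begin{bmatrix} 77 \\ 61 \end{bmatrix} = \begin{bmatrix} \frac{4}{3} & -\frac{5}{3} \\ -\frac{1}{3} & \frac{2}{3} \end{bmatrix} \begin{bmatrix} 77 \\ 61 \end{bmatrix} = \begin{bmatrix} 1 \\ 15 \end{bmatrix}$$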


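Next, let us look for the complementary functions. Using the trial solutions suggested in 
(19.11) and (19.12), the vectors $u$ and $v$ become 

$$u = \begin{bmatrix} m \\ n \end{bmatrix} re^{rt} \qquad \text{and} \qquad v = \begin{bmatrix} m \\ n \end{bmatrix} e^{rt}$$

Substitution of these into the reduced equation 

$$Ju + Mv = 0$$

yields the result 

$$J \begin{bmatrix} m \\ n \end{bmatrix} re^{rt} + M \begin{bmatrix} m \\ n \end{bmatrix} e^{rt} = 0$$

or, after multiplying through by the scalar $e^{-rt}$ and factoring, 

$$(rJ + M) \begin{bmatrix} m \\ n \end{bmatrix} = 0 \tag{19.15}$$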


You should compare this with (19.8'). Since our objective is to find nontrivial solutions of 
$m$ and $n$ (so that our trial solutions will also be nontrivial), it is necessary that 

$$|rJ + M| = 0 \tag{19.16}$$

The analog of (19.9'), this last equation (the characteristic equation of the given equation 
system) will yield the roots $r_i$ that we need. Then, we can find the corresponding 
(nontrivial) values of $m_i$ and $n_i$. 

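In our present example, the characteristic equation is 

$$|rJ + M| = \begin{vmatrix} r + 2 & 2r + 5 \\ 1 & r + 4 \end{vmatrix} = r^2 + 4r + 3 = 0 \tag{19.16'}$$

with roots $r_1 = -1$, $r_2 = -3$. Substituting these into (19.15), we get 

$$\begin{bmatrix} 1 & 3 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} m_1 \\ n_1 \end{bmatrix} = 0 \qquad (\text{for } r_1 = -1)$$

$$\begin{bmatrix} -1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} m_2 \\ n_2 \end{bmatrix} = 0 \qquad (\text{for } r_2 = -3)$$

It follows that $m_1 = -3n_1$ and $m_2 = -n_2$, which we may also express as 

$$\begin{aligned} m_1 &= 3A_1 & \qquad m_2 &= A_2 \\ n_1 &= -A_1 & \qquad n_2 &= -A_2 \end{aligned}$$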

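Now that $r_i$, $m_i$, and $n_i$ have all been found, the complementary functions can be 
written as the following linear combinations of exponential expressions: 

$$\begin{bmatrix} x_c \\ y_c \end{bmatrix} = \begin{bmatrix} \sum m_i e^{r_i t} \\ \sum n_i e^{r_i t} \end{bmatrix} = \begin{bmatrix} 3A_1 e^{-t} + A_2 e^{-3t} \\ -A_1 e^{-t} - A_2 e^{-3t} \end{bmatrix} \qquad [\text{distinct real roots}]$$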
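And the general solution will emerge in the form 

$$\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \begin{bmatrix} x_c \\ y_c \end{bmatrix} + \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix}$$

In our present example, the solution is 

$$\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \begin{bmatrix} 3A_1 e^{-t} + A_2 e^{-3t} + 1 \\ -A_1 e^{-t} - A_2 e^{-3t} + 15 \end{bmatrix}$$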


Moreover, if we are given the initial conditions $x(0) = 6$ and $y(0) = 12$, the arbitrary 
constants can be found to be $A_1 = 1$ and $A_2 = 2$. These will serve to definitize the preceding 
solution. 

Once more we may observe that, since the $e^{r_i t}$ expressions are shared by both time paths 
$x(t)$ and $y(t)$, the latter must either both converge or both diverge. The roots being $-1$ and 
$-3$ in the present case, both time paths converge to their respective equilibria, namely, 
$\bar{x} = 1$ and $\bar{y} = 15$. 
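
The whole solution of (19.13) can be retraced numerically. The sketch below is illustrative only: `solve_2x2` and `path` are names of our own, and the roots $-1$ and $-3$ together with the ratios $m_1 = -3n_1$, $m_2 = -n_2$ are taken from the text rather than recomputed.

```python
import math
from fractions import Fraction as F

M = [[F(2), F(5)], [F(1), F(4)]]
g = [F(77), F(61)]

def solve_2x2(m, rhs):
    """Solve m z = rhs for a 2 x 2 matrix m by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    z1 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    z2 = (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det
    return [z1, z2]

# Particular integrals from M v = g, as in (19.14).
x_bar, y_bar = solve_2x2(M, g)

# Initial conditions x(0) = 6, y(0) = 12 give
#   3*A1 + A2 = 6 - x_bar   and   -A1 - A2 = 12 - y_bar.
A1, A2 = solve_2x2([[F(3), F(1)], [F(-1), F(-1)]], [6 - x_bar, 12 - y_bar])

def path(t):
    """Return x(t), y(t) and their derivatives for the definite solution."""
    e1, e3 = math.exp(-t), math.exp(-3 * t)
    x = 3 * A1 * e1 + A2 * e3 + x_bar
    y = -A1 * e1 - A2 * e3 + y_bar
    dx = -3 * A1 * e1 - 3 * A2 * e3
    dy = A1 * e1 + 3 * A2 * e3
    return x, y, dx, dy

for t in (0.0, 0.5, 1.0, 2.0):
    x, y, dx, dy = path(t)
    # Residuals of the two equations in (19.13) should vanish up to rounding.
    assert abs(dx + 2 * dy + 2 * x + 5 * y - 77) < 1e-9
    assert abs(dy + x + 4 * y - 61) < 1e-9
```

As $t$ grows, both exponential terms die out, so `path(t)` approaches the equilibrium values held in `x_bar` and `y_bar`.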

Even though our example consists of a two-equation system only, the method certainly 
extends to the general n-equation system. When n is large, quantitative solutions may again 
be difficult, but once the characteristic equation is found, a qualitative analysis will always 
be possible by resorting to the Routh theorem. 


Further Comments on the Characteristic Equation 

The term “characteristic equation” has now been encountered in three separate contexts: In 
Sec. 11.3, we spoke of the characteristic equation of a matrix; in Secs. 16.1 and 18.1, the 
term was applied to a single linear differential equation and difference equation; now, 
in this section, we have just introduced the characteristic equation of a system of linear 
difference or differential equations. Is there a connection between the three? 

There indeed is, and the connection is a close one. In the first place, given a single 
equation and an equivalent equation system, as exemplified by the equation (19.1) and 
the system (19.1'), or the equation (19.3) and the system (19.3'), their characteristic 
equations must be identical. For illustration, consider the difference equation (19.1), 
$y_{t+2} + a_1 y_{t+1} + a_2 y_t = c$. We have earlier learned to write its characteristic equation by 
directly transplanting its constant coefficients into a quadratic equation: 

$$b^2 + a_1 b + a_2 = 0$$

What about the equivalent system (19.1')? Taking that system to be in the form of 
$Iu + Kv = d$, as in (19.4''), we have the matrix $K = \begin{bmatrix} a_1 & a_2 \\ -1 & 0 \end{bmatrix}$. So the characteristic 
equation is 

$$|bI + K| = \begin{vmatrix} b + a_1 & a_2 \\ -1 & b \end{vmatrix} = b^2 + a_1 b + a_2 = 0 \qquad [\text{by (19.9')}] \tag{19.17}$$

which is precisely the same as the one obtained from the single equation, as was asserted. 
Naturally, the same type of result holds also in the differential-equation framework, the 
only difference being that we would, in accordance with our convention, replace the symbol 
$b$ by the symbol $r$ in the latter framework. 
It is also possible to link the characteristic equation of a difference- (or differential-) 
equation system to that of a particular square matrix, which we shall call $D$. Referring to 
the definition in (11.14), but using the symbol $b$ (instead of $r$) for the difference-equation 
framework, we can write the characteristic equation of matrix $D$ as follows: 

$$|D - bI| = 0 \tag{19.18}$$

In general, if we multiply every element of the determinant $|D - bI|$ by $-1$, the value of 
the determinant will be unchanged if matrix $D$ contains an even number of rows (or 
columns), and will change its sign if $D$ contains an odd number of rows. In the present case, 
however, since $|D - bI|$ is to be set equal to zero, multiplying every element by $-1$ will 
not matter, regardless of the dimension of matrix $D$. But to multiply every element of the 
determinant $|D - bI|$ by $-1$ is tantamount to multiplying the matrix $(D - bI)$ by $-1$ (see 
Example 6 of Sec. 3.3) before taking its determinant. Thus, (19.18) can be rewritten as 

$$|bI - D| = 0 \tag{19.18'}$$


When this is equated to (19.17), it becomes clear that if we pick the matrix $D = -K$, then 
its characteristic equation will be identical with that of the system (19.1'). This matrix, 
$-K$, has a special meaning: If we take the reduced version of the system, $Iu + Kv = 0$, 
and express it in the form of $Iu = -Kv$, or simply $u = -Kv$, we see that $-K$ is the matrix 
that can transform the vector $v = \begin{bmatrix} x_t \\ y_t \end{bmatrix}$ into the vector $u = \begin{bmatrix} x_{t+1} \\ y_{t+1} \end{bmatrix}$ in that particular 
equation. 

Again, the same reasoning can be adapted to the differential-equation system (19.3'). 
However, in the case of a system such as (19.13'), $Ju + Mv = g$, where, unlike in the 
system (19.3'), the first term is $Ju$ rather than $Iu$, the characteristic equation is in the form 

$$|rJ + M| = 0 \qquad [\text{cf. (19.16')}]$$

For this case, if we wish to find the expression for the matrix $D$, we must first normalize the 
equation $Ju + Mv = g$ into the form of (19.13''), and then take $D = -K = -J^{-1}M$. 

In sum, given (1) a single difference or differential equation, and (2) an equivalent 
equation system, from which we can also obtain (3) an appropriate matrix $D$, if we try to find the 
characteristic equations of all three of these, the results must be one and the same. 
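
The three-way agreement can be checked on the difference-equation example (19.4), where $J = I$ and $M = K$. In the sketch below, `d_matrix` and `char_poly_of` are helper names of our own; the first forms $D = -J^{-1}M$ for any nonsingular 2 × 2 matrix $J$, and the second expands $|bI - D| = b^2 - (\text{trace of } D)\,b + |D|$.

```python
from fractions import Fraction as F

def d_matrix(J, M):
    """Return D = -J^(-1) M for 2 x 2 matrices, with J assumed nonsingular."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    if det == 0:
        raise ValueError("J is singular; the system cannot be normalized")
    J_inv = [[J[1][1] / det, -J[0][1] / det], [-J[1][0] / det, J[0][0] / det]]
    return [[-sum(J_inv[i][k] * M[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def char_poly_of(D):
    """Coefficients (1, c1, c2) of |bI - D| = b^2 + c1*b + c2."""
    trace = D[0][0] + D[1][1]
    det = D[0][0] * D[1][1] - D[0][1] * D[1][0]
    return 1, -trace, det

I = [[F(1), F(0)], [F(0), F(1)]]
K = [[F(6), F(9)], [F(-1), F(0)]]

D = d_matrix(I, K)              # equals -K because J is the identity here
print(D, char_poly_of(D))       # the polynomial should match b^2 + 6b + 9
```

The coefficients returned are those of $b^2 + 6b + 9 = 0$, the characteristic equation found earlier for the system itself.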


EXERCISE 19.2 

1. Verify that the difference-equation system (19.4) is equivalent to the single equation 
$y_{t+2} + 6y_{t+1} + 9y_t = 4$, which was solved earlier as Example 4 in Sec. 18.1. How do the 
solutions obtained by the two different methods compare? 

2. Show that the characteristic equation of the difference equation (19.2) is identical with 
that of the equivalent system (19.2'). 

3. Solve the following two difference-equation systems: 

(a) $x_{t+1} + x_t + 2y_t = 24$ 

$y_{t+1} + 2x_t - 2y_t = 9$ (with $x_0 = 10$ and $y_0 = 9$) 

(b) *i+i = 

*i-i ~/r-i -!/;= $2 


(withxo = 5 and yo = 4) 





4. Solve the following two differential-equation systems: 

(a) $x'(t) - x(t) - 12y(t) = -60$ 

$y'(t) + x(t) + 6y(t) = 36$ [with $x(0) = 13$ and $y(0) = 4$] 

(b) $x'(t) - 2x(t) + 3y(t) = 10$ 

$y'(t) - x(t) + 2y(t) = 9$ [with $x(0) = 8$ and $y(0) = 5$] 

5. On the basis of the differential-equation system (19.13), find the matrix $D$ whose 
characteristic equation is identical with that of the system. Check that the characteristic 
equations of the two are indeed the same. 

19.3 Dynamic Input-Output Models 

Our first encounter with input-output analysis was concerned with the question: How much 
should be produced in each industry so that the input requirements of all industries, as well 
as the final demand (open system), will be exactly satisfied? The context was static, and the 
problem was to solve a simultaneous-equation system for the equilibrium output levels of 
all industries. When certain additional economic considerations are incorporated into the 
model, the input-output system can take on a dynamic character, and there will then result 
a difference- or differential-equation system of the type discussed in Sec. 19.2. 

Three such dynamizing considerations will be considered here. To keep the exposition 
simple, however, we shall illustrate with two-industry open systems only. Nevertheless, 
since we shall employ matrix notation, the generalization to the n-industry case should not 
prove difficult, for it can be accomplished simply by duly changing the dimensions of the 
matrices involved. For purposes of such generalization, it will prove advisable to denote the 
variables not by $x_t$ and $y_t$ but by $x_{1,t}$ and $x_{2,t}$, so that we can extend the notation to $x_{n,t}$ 
when needed. You will recall that, in the input-output context, $x_i$ represents the output 
(measured in dollars) of the ith industry; the new subscript $t$ will now add a time dimension 
to it. The input-coefficient symbol $a_{ij}$ will still mean the dollar worth of the ith commodity 
required in the production of a dollar's worth of the jth commodity, and $d_i$ will again 
indicate the final demand for the ith commodity. 

Time Lag in Production 

In a static two-industry open system, the output of industry I should be set at the level of 
demand as follows: 

$$x_1 = a_{11}x_1 + a_{12}x_2 + d_1$$

Now assume a one-period lag in production, so that the amount demanded in period $t$ 
determines not the current output but the output of period $(t + 1)$. To depict this new 
situation, we must modify the preceding equation to the form 

$$x_{1,t+1} = a_{11}x_{1,t} + a_{12}x_{2,t} + d_{1,t} \tag{19.19}$$

Similarly, we can write for industry II: 

$$x_{2,t+1} = a_{21}x_{1,t} + a_{22}x_{2,t} + d_{2,t} \tag{19.19'}$$

Thus we now have a system of simultaneous difference equations; this constitutes a 
dynamic version of the input-output model. 

In matrix notation, the system consists of the equation 

$$x_{t+1} - Ax_t = d_t \tag{19.20}$$

where 

$$x_{t+1} = \begin{bmatrix} x_{1,t+1} \\ x_{2,t+1} \end{bmatrix} \qquad x_t = \begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix} \qquad A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \qquad d_t = \begin{bmatrix} d_{1,t} \\ d_{2,t} \end{bmatrix}$$

Clearly, (19.20) is in the form of (19.4''), with only two exceptions. First, unlike vector $u$, 
vector $x_{t+1}$ does not have an identity matrix $I$ as its “coefficient.” However, as explained 
earlier, this really makes no analytical difference. The second, and more substantive, point 
is that the vector $d_t$, with a time subscript, implies that the final-demand vector is being 
viewed as a function of time. If this function is nonconstant, a modification will be required 
in the method of finding the particular solutions, although the complementary functions 
will remain unaffected. The following example will illustrate the modified procedure. 


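Example 1 

Given the exponential final-demand vector 

$$d_t = \begin{bmatrix} \delta^t \\ \delta^t \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \delta^t \qquad (\delta = \text{a positive scalar})$$

find the particular solutions of the dynamic input-output model (19.20). In line with the 
method of undetermined coefficients introduced in Sec. 18.4, we should try solutions of the 
form $x_{1,t} = \beta_1 \delta^t$ and $x_{2,t} = \beta_2 \delta^t$, where $\beta_1$ and $\beta_2$ are undetermined coefficients. That is, we 
should try 

$$x_t = \begin{bmatrix} \beta_1 \delta^t \\ \beta_2 \delta^t \end{bmatrix} = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta^t \tag{19.21}$$

which implies* 

$$x_{t+1} = \begin{bmatrix} \beta_1 \delta^{t+1} \\ \beta_2 \delta^{t+1} \end{bmatrix} = \begin{bmatrix} \beta_1 \delta \\ \beta_2 \delta \end{bmatrix} \delta^t = \begin{bmatrix} \delta & 0 \\ 0 & \delta \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta^t$$

If the indicated trial solutions hold, then the system (19.20) will become 

$$\begin{bmatrix} \delta & 0 \\ 0 & \delta \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta^t - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta^t = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \delta^t$$

or, on canceling the common scalar multiplier $\delta^t \neq 0$, 

$$\begin{bmatrix} \delta - a_{11} & -a_{12} \\ -a_{21} & \delta - a_{22} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \tag{19.22}$$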


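* You will note that the vector $\begin{bmatrix} \beta_1 \delta \\ \beta_2 \delta \end{bmatrix}$ can be rewritten in several equivalent forms: 

$$\delta \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta \quad \text{or} \quad \delta \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} \delta & 0 \\ 0 & \delta \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}$$

We choose the third alternative here because in a subsequent step we shall want to add $\begin{bmatrix} \delta & 0 \\ 0 & \delta \end{bmatrix}$ to 
another 2 × 2 matrix. The first two alternative forms will entail problems of dimension conformability. 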
Assuming the coefficient matrix on the extreme left to be nonsingular, we can readily find 
$\beta_1$ and $\beta_2$ (by Cramer's rule) to be 

$$\beta_1 = \frac{\delta - a_{22} + a_{12}}{\Delta} \qquad \text{and} \qquad \beta_2 = \frac{\delta - a_{11} + a_{21}}{\Delta} \tag{19.22'}$$

where $\Delta \equiv (\delta - a_{11})(\delta - a_{22}) - a_{12}a_{21}$. Since $\beta_1$ and $\beta_2$ are now expressed entirely in the 
known values of the parameters, we only need to insert them into the trial solution (19.21) 
to get the definite expressions for the particular solutions. 
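
To see the formulas in (19.22') at work, we may plug in numbers. The input-coefficient matrix and the value of $\delta$ below are purely hypothetical, chosen only so that $\Delta \neq 0$; the sketch then confirms that the resulting trial solution satisfies (19.20) period by period.

```python
from fractions import Fraction as F

# Hypothetical parameter values, not taken from the text.
A = [[F(1, 5), F(2, 5)], [F(3, 10), F(1, 10)]]
delta = F(11, 10)

# Cramer's-rule results (19.22').
det = (delta - A[0][0]) * (delta - A[1][1]) - A[0][1] * A[1][0]
beta1 = (delta - A[1][1] + A[0][1]) / det
beta2 = (delta - A[0][0] + A[1][0]) / det

def x(t):
    """Particular solution (19.21): x_t = [beta1, beta2] * delta^t."""
    return [beta1 * delta ** t, beta2 * delta ** t]

for t in range(5):
    x_t, x_next = x(t), x(t + 1)
    for i in range(2):
        # Row i of x_{t+1} - A x_t must equal the final demand delta^t.
        assert x_next[i] - (A[i][0] * x_t[0] + A[i][1] * x_t[1]) == delta ** t
```

With these particular numbers the determinant works out to 39/50, so the nonsingularity assumption is met.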

A more general version of the type of final-demand vector discussed here is given in 
Exercise 19.3-1. 

The procedure for finding the complementary functions of (19.20) is no different from 
that presented in Sec. 19.2. Since the homogeneous version of the equation system is 
$x_{t+1} - Ax_t = 0$, the characteristic equation should be 

$$|bI - A| = \begin{vmatrix} b - a_{11} & -a_{12} \\ -a_{21} & b - a_{22} \end{vmatrix} = 0 \qquad [\text{cf. (19.9')}]$$

From this we can find the characteristic roots $b_1$ and $b_2$ and thence proceed to the 
remaining steps of the solution process. 


Excess Demand and Output Adjustment 

The model formulation in (19.20) can also arise from a different economic assumption. 
Consider the situation in which the excess demand for each product always tends to induce 
an output increment equal to the excess demand. Since the excess demand for the first 
product in period $t$ amounts to 

$$\underbrace{a_{11}x_{1,t} + a_{12}x_{2,t} + d_{1,t}}_{\text{demanded}} - \underbrace{x_{1,t}}_{\text{supplied}}$$

the output adjustment (increment) $\Delta x_{1,t}$ is to be set exactly equal to that level: 

$$\Delta x_{1,t} \ (\equiv x_{1,t+1} - x_{1,t}) = a_{11}x_{1,t} + a_{12}x_{2,t} + d_{1,t} - x_{1,t}$$

However, if we add $x_{1,t}$ to both sides of this equation, the result will become identical with 
(19.19). Similarly, our output-adjustment assumption will give an equation the same as 
(19.19') for the second industry. In short, the same mathematical model can result from 
altogether different economic assumptions. 
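
The equivalence of the two assumptions is easy to confirm by iteration. In this sketch the matrix $A$, the constant final demand, and the starting outputs are all hypothetical, and the two function names are ours.

```python
from fractions import Fraction as F

# Hypothetical input coefficients and constant final demand.
A = [[F(1, 5), F(2, 5)], [F(3, 10), F(1, 10)]]
d = [F(1), F(1)]

def lag_step(x):
    """Production lag, as in (19.19): next output equals current total demand."""
    return [A[i][0] * x[0] + A[i][1] * x[1] + d[i] for i in range(2)]

def adjustment_step(x):
    """Output adjustment: the output increment equals current excess demand."""
    excess = [A[i][0] * x[0] + A[i][1] * x[1] + d[i] - x[i] for i in range(2)]
    return [x[i] + excess[i] for i in range(2)]

x_lag = x_adj = [F(2), F(2)]   # hypothetical initial outputs
for _ in range(10):
    x_lag, x_adj = lag_step(x_lag), adjustment_step(x_adj)
    assert x_lag == x_adj       # the two rules trace out the same time path
```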

So far, the input-output system has been viewed only in the discrete-time framework. 
For comparison purposes, let us now cast the output-adjustment process in the 
continuous-time mold. 

In the main, this would call for use of the symbol $x_i(t)$ in lieu of $x_{i,t}$, and of the 
derivative $x_i'(t)$ in lieu of the difference $\Delta x_{i,t}$. With these changes, our output-adjustment 
assumption will manifest itself in the following pair of differential equations: 

$$\begin{aligned} x_1'(t) &= a_{11}x_1(t) + a_{12}x_2(t) + d_1(t) - x_1(t) \\ x_2'(t) &= a_{21}x_1(t) + a_{22}x_2(t) + d_2(t) - x_2(t) \end{aligned}$$

At any instant of time $t = t_0$, the symbol $x_i(t_0)$ tells us the rate of output flow per unit of 
time (say, per month) that prevails at the said instant, and $d_i(t_0)$ indicates the final demand 
per month prevailing at that instant. Hence the right-hand sum in each equation indicates 
the rate of excess demand per month, measured at $t = t_0$. The derivative $x_i'(t_0)$ at the left, 
on the other hand, represents the rate of output adjustment per month called forth by the 
excess demand at $t = t_0$. This adjustment will eradicate the excess demand (and bring about 
equilibrium) in a month's time, but only if both the excess demand and the output 
adjustment stay unchanged at the current rates. In actuality, the excess demand will vary with 
time, as will the induced output adjustment, thus resulting in a cat-and-mouse game of 
chase. The solution of the system, consisting of the time paths of the output $x_i$, supplies a 
chronicle of this chase. If the solution is convergent, the cat (output adjustment) will 
eventually be able to catch the mouse (excess demand), asymptotically (as $t \to \infty$). 

After proper rearrangement, this system of differential equations can be written in the 
format of (19.13') as follows: 

$$Ix' + (I - A)x = d \tag{19.23}$$

where 

$$x' = \begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix} \qquad x = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} \qquad A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \qquad d = \begin{bmatrix} d_1(t) \\ d_2(t) \end{bmatrix}$$

(the prime denoting derivative, not transpose). The complementary functions can be found 
by the method discussed earlier. In particular, the characteristic roots are to be found from 
the equation 

$$|rI + (I - A)| = \begin{vmatrix} r + 1 - a_{11} & -a_{12} \\ -a_{21} & r + 1 - a_{22} \end{vmatrix} = 0 \qquad [\text{cf. (19.16)}]$$

As for the particular integrals, if the final-demand vector contains nonconstant functions 
of time $d_1(t)$ and $d_2(t)$ as its elements, a modification will be needed in the method of 
solution. Let us illustrate with a simple example. 


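Example 2 

Given the final-demand vector 

$$d = \begin{bmatrix} \lambda_1 e^{\rho t} \\ \lambda_2 e^{\rho t} \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix} e^{\rho t}$$

where $\lambda_i$ and $\rho$ are constants, find the particular integrals of the dynamic model (19.23). 
Using the method of undetermined coefficients, we can try solutions of the form 
$x_i(t) = \beta_i e^{\rho t}$, which imply, of course, that $x_i'(t) = \rho \beta_i e^{\rho t}$. In matrix notation, these can be 
written as 

$$x = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} e^{\rho t} \tag{19.24}$$

and 

$$x' = \rho \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} e^{\rho t} = \begin{bmatrix} \rho & 0 \\ 0 & \rho \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} e^{\rho t} \qquad [\text{cf. footnote in Example 1}]$$

Upon substituting into (19.23) and canceling the common (nonzero) scalar multiplier $e^{\rho t}$, 
we obtain 

$$\begin{bmatrix} \rho & 0 \\ 0 & \rho \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \begin{bmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix}$$

or 

$$\begin{bmatrix} \rho + 1 - a_{11} & -a_{12} \\ -a_{21} & \rho + 1 - a_{22} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix} \tag{19.25}$$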
If the leftmost matrix is nonsingular, we can apply Cramer's rule and determine the values 
of the coefficients $\beta_i$ to be 

$$\beta_1 = \frac{\lambda_1(\rho + 1 - a_{22}) + \lambda_2 a_{12}}{\Delta} \qquad \beta_2 = \frac{\lambda_2(\rho + 1 - a_{11}) + \lambda_1 a_{21}}{\Delta} \tag{19.25'}$$

where $\Delta \equiv (\rho + 1 - a_{11})(\rho + 1 - a_{22}) - a_{12}a_{21}$. The undetermined coefficients having 
thus been determined, we can introduce these values into the trial solution (19.24) to obtain 
the desired particular integrals. 
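
The formulas in (19.25') can be tested against the matrix equation (19.25) from which they were derived. All parameter values in the sketch below are hypothetical; since the factor $e^{\rho t}$ cancels out, the check can be carried out in exact fractions.

```python
from fractions import Fraction as F

# Hypothetical parameter values, not taken from the text.
A = [[F(1, 5), F(2, 5)], [F(3, 10), F(1, 10)]]
lam = [F(1), F(2)]      # lambda_1, lambda_2
rho = F(1, 10)

# Cramer's-rule results (19.25').
det = (rho + 1 - A[0][0]) * (rho + 1 - A[1][1]) - A[0][1] * A[1][0]
beta1 = (lam[0] * (rho + 1 - A[1][1]) + lam[1] * A[0][1]) / det
beta2 = (lam[1] * (rho + 1 - A[0][0]) + lam[0] * A[1][0]) / det

# Left side of (19.25): (rho*I + I - A) applied to [beta1, beta2].
lhs = [
    (rho + 1 - A[0][0]) * beta1 - A[0][1] * beta2,
    -A[1][0] * beta1 + (rho + 1 - A[1][1]) * beta2,
]
assert lhs == lam
print(beta1, beta2)
```

The particular integrals are then $x_i(t) = \beta_i e^{\rho t}$ with these values of $\beta_i$.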


Capital Formation 

Another economic consideration that can give rise to a dynamic input-output system is 
capital formation, including the accumulation of inventory. 

In the static discussion, we only considered the output level of each product needed to 
satisfy current demand. The needs for inventory accumulation or capital formation were 
either ignored, or subsumed under the final-demand vector. To bring capital formation 
into the open, let us now consider, along with an input-coefficient matrix $A = [a_{ij}]$, a 
capital-coefficient matrix 

$$C \equiv [c_{ij}] = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}$$

where $c_{ij}$ denotes the dollar worth of the ith commodity needed by the jth industry as new 
capital (either equipment or inventory, depending on the nature of the ith commodity) as a 
result of an output increment of $1 in the jth industry. For example, if an increase of $1 in 
the output of the soft-drink (jth) industry induces it to add $2 worth of bottling equipment 
(ith commodity), then $c_{ij} = 2$. Such a capital coefficient thus reveals a marginal 
capital-output ratio of sorts, the ratio being limited to one type of capital (the ith commodity) only. 
Like the input coefficients $a_{ij}$, the capital coefficients are assumed to be fixed. The idea is 
for the economy to produce each commodity in such quantity as to satisfy not only the 
input-requirement demand plus the final demand, but also the capital-requirement demand 
for it. 

If time is continuous, output increment is indicated by the derivatives $x_i'(t)$; thus the 
output of each industry should be set at 

$$\begin{aligned} x_1(t) &= \underbrace{a_{11}x_1(t) + a_{12}x_2(t)}_{\text{input requirement}} + \underbrace{c_{11}x_1'(t) + c_{12}x_2'(t)}_{\text{capital requirement}} + \underbrace{d_1(t)}_{\text{final demand}} \\ x_2(t) &= a_{21}x_1(t) + a_{22}x_2(t) + c_{21}x_1'(t) + c_{22}x_2'(t) + d_2(t) \end{aligned}$$

In matrix notation, this is expressible by the equation 

$$Ix = Ax + Cx' + d$$

or 

$$Cx' + (A - I)x = -d \tag{19.26}$$
If time is discrete, the capital requirement in period $t$ will be based on the output 
increment $x_{i,t} - x_{i,t-1}$ ($\equiv \Delta x_{i,t-1}$); thus the output levels should be set at 

$$\begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix} = \underbrace{\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix}}_{\text{input requirement}} + \underbrace{\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} \begin{bmatrix} x_{1,t} - x_{1,t-1} \\ x_{2,t} - x_{2,t-1} \end{bmatrix}}_{\text{capital requirement}} + \underbrace{\begin{bmatrix} d_{1,t} \\ d_{2,t} \end{bmatrix}}_{\text{final demand}}$$

or 

$$Ix_t = Ax_t + C(x_t - x_{t-1}) + d_t$$

By shifting the time subscripts forward one period, and collecting terms, however, we can 
write the equation in the form 

$$(I - A - C)x_{t+1} + Cx_t = d_{t+1} \tag{19.27}$$

The differential-equation system (19.26) and the difference-equation system (19.27) can 
again be solved, of course, by the method of Sec. 19.2. It also goes without saying that these 
two matrix equations are both extendible to the n-industry case simply by an appropriate 
redefinition of the matrices and a corresponding change in the dimensions thereof. 
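
The discrete-time system (19.27) can also be traced forward numerically, one period at a time, by solving $(I - A - C)x_{t+1} = d_{t+1} - Cx_t$ for $x_{t+1}$. The sketch below assumes hypothetical matrices $A$ and $C$, a constant final demand, and arbitrary initial outputs; it requires $(I - A - C)$ to be nonsingular.

```python
from fractions import Fraction as F

# Hypothetical coefficients, chosen so that (I - A - C) is nonsingular.
A = [[F(1, 5), F(2, 5)], [F(3, 10), F(1, 10)]]
C = [[F(1, 10), F(0)], [F(0), F(1, 5)]]
d = [F(1), F(1)]   # constant final demand, for simplicity

def solve_2x2(m, rhs):
    """Solve m z = rhs for a 2 x 2 matrix m by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    z1 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    z2 = (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det
    return [z1, z2]

def next_output(x_t, d_next):
    """Solve (I - A - C) x_{t+1} = d_{t+1} - C x_t, i.e., (19.27), for x_{t+1}."""
    lhs = [[(1 if i == j else 0) - A[i][j] - C[i][j] for j in range(2)]
           for i in range(2)]
    rhs = [d_next[i] - (C[i][0] * x_t[0] + C[i][1] * x_t[1]) for i in range(2)]
    return solve_2x2(lhs, rhs)

x = [F(2), F(2)]   # hypothetical initial outputs
for _ in range(5):
    x_new = next_output(x, d)
    for i in range(2):
        # Output must cover input, capital, and final-demand requirements.
        demand = (A[i][0] * x_new[0] + A[i][1] * x_new[1]
                  + C[i][0] * (x_new[0] - x[0]) + C[i][1] * (x_new[1] - x[1])
                  + d[i])
        assert x_new[i] == demand
    x = x_new
```

The inner assertion restates the balance equation $Ix_t = Ax_t + C(x_t - x_{t-1}) + d_t$ from which (19.27) was obtained.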

In the preceding, we have discussed how a dynamic input-output model can arise from 
such considerations as time lags and adjustment mechanisms. When similar considerations 
are applied to general-equilibrium market models, the latter will tend to become dynamic 
in much the same way. But, since the formulation of such models is analogous in spirit to 
input-output models, we shall dispense with a formal discussion thereof and merely refer 
you to the illustrative cases in Exercises 19.3-6 and 19.3-7. 


EXERCISE 19.3 

1. In Example 1, if the final-demand vector is changed to $d_t = \begin{bmatrix} \lambda_1 \delta^t \\ \lambda_2 \delta^t \end{bmatrix}$, what will the 
particular solutions be? After finding your answers, show that the answers in Example 1 
are merely a special case of these, with $\lambda_1 = \lambda_2 = 1$. 

2. (a) Show that (19.22) can be written more concisely as 

$(\delta I - A)\beta = u$ 

(b) Of the five symbols used, which are scalars? Vectors? Matrices? 

(c) Write the solution for $\beta$ in matrix form, assuming $(\delta I - A)$ to be nonsingular. 

3. (a) Show that (19.25) can be written more concisely as 

$(\rho I + I - A)\beta = \lambda$ 

(b) Which of the five symbols represent scalars, vectors, and matrices, respectively? 
(c) Write the solution for $\beta$ in matrix form, assuming $(\rho I + I - A)$ to be nonsingular. 


A. Given A = 


" 3 

4 " 


r^fi 

10 

1C 

and d< = 

V10 / 

3 

- 10 

2 

10 _ 

[m\ 


for the discrete-time production-lag input- 


output model described in ( 19 . 20 ), find (a) the particular solutions; (b) the comple¬ 
mentary functions; and (c) the definite time paths, assuming initial outputs *1,0 = ^ 


and ><2,0 = ^. (Use fractions, not decimals, in all calculations.) 
‘ i 

4 


' e r /10 ' 

5. Given A = 

10 

3 

- io 

10 

2 

10 _ 

and d = 

2e t/: \ 


for the continuous-time output-adjustment 


input-output model described in (19.23), find (a) the particular integrals; (b) the com¬ 
plementary functions; and (c) the definite time paths, assuming initial conditions 
*1 (0) = ^ and *2(0) = (Use fractions, not decimals, in all calculations.) 

6. In an n-commodity market, all $Q_{di}$ and $Q_{si}$ (with $i = 1, 2, \ldots, n$) can be considered 
as functions of the n prices $P_1, \ldots, P_n$, and so can the excess demand for each 
commodity $E_i \equiv Q_{di} - Q_{si}$. Assuming linearity, we can write 

$$\begin{aligned} E_1 &= a_{10} + a_{11}P_1 + a_{12}P_2 + \cdots + a_{1n}P_n \\ E_2 &= a_{20} + a_{21}P_1 + a_{22}P_2 + \cdots + a_{2n}P_n \\ &\;\;\vdots \\ E_n &= a_{n0} + a_{n1}P_1 + a_{n2}P_2 + \cdots + a_{nn}P_n \end{aligned}$$

or, in matrix notation, 

$E = a + AP$ 

(a) What do these last four symbols stand for: scalars, vectors, or matrices? What are 
their respective dimensions? 

(b) Consider all prices to be functions of time, and assume that $dP_i/dt = \alpha_i E_i$ ($i = 1, 
2, \ldots, n$). What is the economic interpretation of this last set of equations? 

(c) Write out the differential equations showing each $dP_i/dt$ to be a linear function of 
the n prices. 

(d) Show that, if we let $P'$ denote the n × 1 column vector of the derivatives $dP_i/dt$, 
and if we let $\alpha$ denote an n × n diagonal matrix, with $\alpha_1, \alpha_2, \ldots, \alpha_n$ (in that order) 
in the principal diagonal and zeros elsewhere, we can write the preceding 
differential-equation system in matrix notation as $P' - \alpha AP = \alpha a$. 

7. For the n-commodity market of Prob. 6, the discrete-time version would consist of a set 
of difference equations $\Delta P_{i,t} = \alpha_i E_{i,t}$ ($i = 1, 2, \ldots, n$), where $E_{i,t} = a_{i0} + a_{i1}P_{1,t} + 
a_{i2}P_{2,t} + \cdots + a_{in}P_{n,t}$. 

(a) Write out the excess-demand equation system, and show that it can be expressed 
in matrix notation as $E_t = a + AP_t$. 

(b) Show that the price adjustment equations can be written as $P_{t+1} - P_t = \alpha E_t$, 
where $\alpha$ is the n × n diagonal matrix defined in Prob. 6. 

(c) Show that the difference-equation system of the present discrete-time model can 
be expressed in the form $P_{t+1} - (I + \alpha A)P_t = \alpha a$. 


19.4 The Inflation-Unemployment 
Model Once More 


Having illustrated the multisector type of dynamic systems with input-output models, we 
shall now provide an economic example of simultaneous dynamic equations in the 
one-sector setting. For this purpose, the inflation-unemployment model, already encountered 
twice before in two different guises, can be called back into service once again. 

Simultaneous Differential Equations 

In Sec. 16.5 the inflation-unemployment model was presented in the continuous-time 
framework via the following three equations: 

$$p = \alpha - T - \beta U + g\pi \qquad (\alpha, \beta > 0;\ 0 < g \leq 1) \tag{16.33}$$

$$\frac{d\pi}{dt} = j(p - \pi) \qquad (0 < j \leq 1) \tag{16.34}$$

$$\frac{dU}{dt} = -k(\mu - p) \qquad (k > 0) \tag{16.35}$$

except that we have adopted the Greek letter $\mu$ here to replace $m$ in (16.35) in order to avoid 
confusion with our earlier usage of the symbol $m$ in the methodological discussion of Sec. 19.2. 
In the treatment of this model in Sec. 16.5, since we were not yet equipped then to deal with 
simultaneous dynamic equations, we approached the problem by condensing the model into a 
single equation in one variable. That necessitated a quite laborious process of substitutions and 
eliminations. Now, in view of the coexistence of two given patterns of change in the model for 
$\pi$ and $U$, we shall treat the model as one of two simultaneous differential equations. 

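When (16.33) is substituted into the other two equations, and the derivatives $d\pi/dt \equiv \pi'(t)$
and $dU/dt \equiv U'(t)$ are written more simply as $\pi'$ and $U'$, the model assumes the form

$$\begin{aligned} \pi' + j(1-g)\pi + j\beta U &= j(\alpha - T) \\ U' - kg\pi + k\beta U &= k(\alpha - T - \mu) \end{aligned} \tag{19.28}$$

or, in matrix notation,

$$\underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}}_{J} \begin{bmatrix} \pi' \\ U' \end{bmatrix} + \underbrace{\begin{bmatrix} j(1-g) & j\beta \\ -kg & k\beta \end{bmatrix}}_{M} \begin{bmatrix} \pi \\ U \end{bmatrix} = \begin{bmatrix} j(\alpha - T) \\ k(\alpha - T - \mu) \end{bmatrix} \tag{19.28'}$$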

From this system, the time paths of $\pi$ and $U$ can be found simultaneously. Then, if desired,
we can derive the $p$ path by using (16.33).


Solution Paths 

To find the particular integrals, we can simply set $\pi' = U' = 0$ (to make $\pi$ and $U$ stationary
over time) in (19.28') and solve for $\pi$ and $U$. In our earlier discussion, in (19.14), such
solutions were obtained through matrix inversion, but Cramer's rule can certainly be used,
too. Either way, we can find that

$$\bar{\pi} = \mu \qquad \text{and} \qquad \bar{U} = \frac{1}{\beta}\,[\alpha - T - (1-g)\mu] \tag{19.29}$$

The result that $\bar{\pi} = \mu$ (the equilibrium expected rate of inflation equals the rate of monetary
expansion) coincides with that reached in Sec. 16.5. As to the rate of unemployment
$U$, we made no attempt to find its equilibrium level in that section. If we did (on the basis
of the differential equation in $U$ given in Exercise 16.5-2), however, the answer would be no
different from the $\bar{U}$ solution in (19.29).
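As a numerical cross-check on the particular integrals, the stationary version of (19.28') can be solved by Cramer's rule in a few lines. The sketch below is only illustrative: the function name `steady_state` and the parameter values are our own assumptions (any values respecting the sign restrictions of the model would serve), and exact fractions are used so that the comparison with (19.29) involves no rounding.

```python
from fractions import Fraction as F

def steady_state(alpha_minus_T, beta, g, j, k, mu):
    """Solve M [pi, U]' = c from (19.28') with pi' = U' = 0, by Cramer's rule."""
    # Coefficient matrix M and constant vector c of the stationary system.
    m11, m12 = j * (1 - g), j * beta
    m21, m22 = -k * g, k * beta
    c1, c2 = j * alpha_minus_T, k * (alpha_minus_T - mu)
    det = m11 * m22 - m12 * m21            # simplifies to j*k*beta, which is nonzero
    pi_bar = (c1 * m22 - m12 * c2) / det   # first column of M replaced by c
    U_bar = (m11 * c2 - c1 * m21) / det    # second column of M replaced by c
    return pi_bar, U_bar

# Assumed, purely illustrative parameter values (kept as exact fractions).
params = dict(alpha_minus_T=F(1, 6), beta=F(3), g=F(1, 2),
              j=F(3, 4), k=F(1, 2), mu=F(1, 20))
pi_bar, U_bar = steady_state(**params)

# Closed-form unemployment rate from (19.29), for comparison.
closed_form_U = (params["alpha_minus_T"] - (1 - params["g"]) * params["mu"]) / params["beta"]
print(pi_bar, U_bar, closed_form_U)
```

With these assumed values the computed inflation rate coincides with $\mu$, and the computed unemployment rate coincides with the closed-form expression, as (19.29) requires.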

Turning to the complementary functions, which are based on the trial solutions $me^{rt}$ and
$ne^{rt}$, we can determine $m$, $n$, and $r$ from the reduced matrix equation

$$(rJ + M)\begin{bmatrix} m \\ n \end{bmatrix} = 0 \qquad [\text{from (19.15)}]$$
which, in the present context, takes the form

$$\begin{bmatrix} r + j(1-g) & j\beta \\ -kg & r + k\beta \end{bmatrix}\begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{19.30}$$


To avoid trivial solutions for $m$ and $n$ from this homogeneous system, the determinant of
the coefficient matrix must be made to vanish; that is, we require

$$|rJ + M| = r^2 + [k\beta + j(1-g)]r + k\beta j = 0 \tag{19.31}$$

This quadratic equation, a specific version of the characteristic equation $r^2 + a_1 r + a_2 = 0$,
has coefficients

$$a_1 = k\beta + j(1-g) \qquad \text{and} \qquad a_2 = k\beta j$$

And these, as we would expect, are precisely the $a_1$ and $a_2$ values in (16.37''), a single-equation
version of the present model in the variable $\pi$. As a result, the previous analysis
of the three cases of characteristic roots should apply here with equal validity. Among other
conclusions, we may recall that, regardless of whether the roots happen to be real or complex,
the real part of each root in the present model turns out to be always negative. Thus
the solution paths are always convergent.
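The convergence claim can be spot-checked numerically. Since $a_1 > 0$ and $a_2 > 0$ under the parameter restrictions of the model, both roots of the characteristic equation must have negative real parts. The following sketch computes the roots for a few parameter sets; the function name `characteristic_roots` and all parameter values are assumptions chosen only for illustration.

```python
import cmath

def characteristic_roots(beta, g, j, k):
    """Roots of r^2 + a1*r + a2 = 0, with a1 and a2 taken from (19.31)."""
    a1 = k * beta + j * (1 - g)
    a2 = k * beta * j
    root_disc = cmath.sqrt(a1 * a1 - 4 * a2)   # complex sqrt covers real and complex cases
    return (-a1 + root_disc) / 2, (-a1 - root_disc) / 2

# Assumed parameter sets, each respecting beta, k > 0, 0 < g <= 1, 0 < j <= 1.
cases = [
    dict(beta=3.0, g=1.0, j=0.75, k=0.5),   # complex roots
    dict(beta=0.5, g=0.2, j=1.0, k=0.1),    # distinct real roots
    dict(beta=2.0, g=0.5, j=0.5, k=1.0),    # distinct real roots
]
all_roots = [characteristic_roots(**c) for c in cases]
for c, (r1, r2) in zip(cases, all_roots):
    print(c, r1, r2)
```

In each of these cases the real parts of both roots are negative, in line with the conclusion stated above. A handful of cases is, of course, a check rather than a proof.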


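Example 1

Find the time paths of $\pi$ and $U$, given the parameter values

$$\alpha - T = \tfrac{1}{6} \qquad \beta = 3 \qquad g = 1 \qquad j = \tfrac{3}{4} \qquad \text{and} \qquad k = \tfrac{1}{2}$$

Since these parameter values duplicate those in Example 1 in Sec. 16.5, the results of the
present analysis can be readily checked against those of the said section.

First, it is easy to determine that the particular integrals are

$$\bar{\pi} = \mu \qquad \text{and} \qquad \bar{U} = \frac{1}{3}\left(\frac{1}{6}\right) = \frac{1}{18} \qquad [\text{by (19.29)}] \tag{19.32}$$

The characteristic equation being

$$r^2 + \frac{3}{2}r + \frac{9}{8} = 0 \qquad [\text{by (19.31)}]$$

the two roots turn out to be complex:

$$r_1, r_2 = \frac{1}{2}\left(-\frac{3}{2} \pm \sqrt{\frac{9}{4} - \frac{9}{2}}\,\right) = -\frac{3}{4} \pm \frac{3}{4}i \qquad \left(\text{with } h = -\frac{3}{4} \text{ and } v = \frac{3}{4}\right) \tag{19.33}$$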


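Substitution of the two roots (along with the parameter values) into (19.30) yields, respectively,
the matrix equations

$$\begin{bmatrix} -\frac{3}{4} + \frac{3}{4}i & \frac{9}{4} \\ -\frac{1}{2} & \frac{3}{4} + \frac{3}{4}i \end{bmatrix}\begin{bmatrix} m_1 \\ n_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad \left[\text{from } r_1 = -\tfrac{3}{4} + \tfrac{3}{4}i\right] \tag{19.34}$$

$$\begin{bmatrix} -\frac{3}{4} - \frac{3}{4}i & \frac{9}{4} \\ -\frac{1}{2} & \frac{3}{4} - \frac{3}{4}i \end{bmatrix}\begin{bmatrix} m_2 \\ n_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad \left[\text{from } r_2 = -\tfrac{3}{4} - \tfrac{3}{4}i\right] \tag{19.34'}$$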
Since $r_1$ and $r_2$ are designed, via (19.31), to make the coefficient matrix singular, each of
the preceding two matrix equations actually contains only one independent equation,
which can determine only a proportionality relation between the arbitrary constants $m_i$ and
$n_i$. Specifically, we have

$$\frac{1}{3}(1 - i)m_1 = n_1 \qquad \text{and} \qquad \frac{1}{3}(1 + i)m_2 = n_2$$


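The complementary functions can, accordingly, be expressed as

$$\begin{aligned} \begin{bmatrix} \pi_c \\ U_c \end{bmatrix} &= \begin{bmatrix} m_1 e^{r_1 t} + m_2 e^{r_2 t} \\ n_1 e^{r_1 t} + n_2 e^{r_2 t} \end{bmatrix} = e^{ht}\begin{bmatrix} m_1 e^{vit} + m_2 e^{-vit} \\ n_1 e^{vit} + n_2 e^{-vit} \end{bmatrix} && [\text{by (16.11)}] \\ &= e^{ht}\begin{bmatrix} (m_1 + m_2)\cos vt + (m_1 - m_2)i \sin vt \\ (n_1 + n_2)\cos vt + (n_1 - n_2)i \sin vt \end{bmatrix} && [\text{by (16.24)}] \end{aligned}$$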


If, for notational simplicity, we define new arbitrary constants

$$A_5 \equiv m_1 + m_2 \qquad \text{and} \qquad A_6 \equiv (m_1 - m_2)i$$

it then follows that¹

$$n_1 + n_2 = \frac{1}{3}(A_5 - A_6) \qquad (n_1 - n_2)i = \frac{1}{3}(A_5 + A_6)$$


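So, using these, and incorporating the $h$ and $v$ values of (19.33) into the complementary
functions, we end up with

$$\begin{bmatrix} \pi_c \\ U_c \end{bmatrix} = e^{-3t/4}\begin{bmatrix} A_5 \cos\frac{3}{4}t + A_6 \sin\frac{3}{4}t \\ \frac{1}{3}(A_5 - A_6)\cos\frac{3}{4}t + \frac{1}{3}(A_5 + A_6)\sin\frac{3}{4}t \end{bmatrix} \tag{19.35}$$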


Finally, by combining the particular integrals in (19.32) with the above complementary
functions, we can obtain the solution paths of $\pi$ and $U$. As may be expected, these paths
are exactly the same as those in (16.43) and (16.45) in Sec. 16.5.
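A simple way to gain confidence in the complementary functions of Example 1 is to substitute them back into the reduced (homogeneous) form of (19.28) and confirm that both left-hand sides vanish. The sketch below does this numerically, using central differences for the derivatives. The names `complementary` and `residuals`, the chosen constants $A_5 = 1.3$ and $A_6 = -0.7$, and the sample dates are all our own illustrative assumptions.

```python
import math

# Parameter values of Example 1 (alpha - T drops out of the homogeneous system).
beta, g, j, k = 3.0, 1.0, 0.75, 0.5
h, v = -0.75, 0.75            # real and imaginary parts of the roots in (19.33)

def complementary(t, A5, A6):
    """pi_c(t) and U_c(t) as given in (19.35)."""
    decay = math.exp(h * t)
    pi_c = decay * (A5 * math.cos(v * t) + A6 * math.sin(v * t))
    U_c = decay * ((A5 - A6) * math.cos(v * t) + (A5 + A6) * math.sin(v * t)) / 3
    return pi_c, U_c

def residuals(t, A5, A6, eps=1e-6):
    """Left-hand sides of the reduced form of (19.28), via central differences."""
    pi_c, U_c = complementary(t, A5, A6)
    pi_hi, U_hi = complementary(t + eps, A5, A6)
    pi_lo, U_lo = complementary(t - eps, A5, A6)
    dpi = (pi_hi - pi_lo) / (2 * eps)
    dU = (U_hi - U_lo) / (2 * eps)
    res1 = dpi + j * (1 - g) * pi_c + j * beta * U_c
    res2 = dU - k * g * pi_c + k * beta * U_c
    return res1, res2

# Largest absolute residual over a few sample dates, for arbitrary constants.
worst = max(abs(r) for t in (0.0, 0.5, 1.0, 2.0, 5.0)
            for r in residuals(t, A5=1.3, A6=-0.7))
print(worst)
```

The residuals are zero up to the small error inherent in numerical differentiation, whatever values are assigned to the two arbitrary constants.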


Simultaneous Difference Equations 

The simultaneous-equation treatment of the inflation-unemployment model in discrete 
time is similar in spirit to the preceding continuous-time discussion. We shall thus merely 
give the highlights. 


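¹ This can be seen from the following:

$$n_1 + n_2 = \tfrac{1}{3}(1 - i)m_1 + \tfrac{1}{3}(1 + i)m_2 = \tfrac{1}{3}[(m_1 + m_2) - (m_1 - m_2)i] = \tfrac{1}{3}(A_5 - A_6)$$

$$(n_1 - n_2)i = \left[\tfrac{1}{3}(1 - i)m_1 - \tfrac{1}{3}(1 + i)m_2\right]i = \tfrac{1}{3}[(m_1 - m_2) - (m_1 + m_2)i]\,i = \tfrac{1}{3}(A_6 + A_5) \qquad [i^2 = -1]$$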
The model in question, as given in Sec. 18.3, consists of three equations, two of which
describe the patterns of change of $\pi$ and $U$, respectively:

$$p_t = \alpha - T - \beta U_t + g\pi_t \tag{18.18}$$

$$\pi_{t+1} - \pi_t = j(p_t - \pi_t) \tag{18.19}$$

$$U_{t+1} - U_t = -k(\mu - p_{t+1}) \tag{18.20}$$


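Eliminating $p$, and collecting terms, we can rewrite the model as the difference-equation
system

$$\underbrace{\begin{bmatrix} 1 & 0 \\ -kg & 1 + \beta k \end{bmatrix}}_{J} \begin{bmatrix} \pi_{t+1} \\ U_{t+1} \end{bmatrix} + \underbrace{\begin{bmatrix} -(1 - j + jg) & j\beta \\ 0 & -1 \end{bmatrix}}_{K} \begin{bmatrix} \pi_t \\ U_t \end{bmatrix} = \begin{bmatrix} j(\alpha - T) \\ k(\alpha - T - \mu) \end{bmatrix} \tag{19.36}$$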


Solution Paths 

If stationary equilibriums exist, the particular solutions of (19.36) can be expressed as
$\bar{\pi} = \pi_t = \pi_{t+1}$ and $\bar{U} = U_t = U_{t+1}$. Substituting $\bar{\pi}$ and $\bar{U}$ into (19.36), and solving the
system (by matrix inversion or Cramer's rule), we obtain

$$\bar{\pi} = \mu \qquad \text{and} \qquad \bar{U} = \frac{1}{\beta}\,[\alpha - T - (1-g)\mu] \tag{19.37}$$

The $\bar{U}$ value is the same as what was found in Sec. 18.3. Although we did not find $\bar{\pi}$ in the
latter section, the information in Exercise 18.3-2 indicates that $\bar{\pi} = \mu$, which agrees with
(19.37). In fact, you may note, the results in (19.37) are also identical with the intertemporal
equilibrium values obtained in the continuous-time framework in (19.29).

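The search for the complementary functions, based this time on the trial solutions $mb^t$
and $nb^t$, involves the reduced matrix equation

$$(bJ + K)\begin{bmatrix} m \\ n \end{bmatrix} = 0$$

or, in view of (19.36),

$$\begin{bmatrix} b - (1 - j + jg) & j\beta \\ -bkg & b(1 + \beta k) - 1 \end{bmatrix}\begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{19.38}$$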

In order to avoid trivial solutions from this homogeneous system, we require

$$|bJ + K| = (1 + \beta k)b^2 - [1 + gj + (1 - j)(1 + \beta k)]b + (1 - j + jg) = 0 \tag{19.39}$$

The normalized version of this quadratic equation is the characteristic equation
$b^2 + a_1 b + a_2 = 0$, with the same $a_1$ and $a_2$ coefficients as in (18.24) and (18.33) in Sec. 18.3.
Consequently, the analysis of the three cases of characteristic roots undertaken in that
section should equally apply here.

For each root, $b_i$, (19.38) supplies us with a specific proportionality relation between
the arbitrary constants $m_i$ and $n_i$, and these enable us to link the arbitrary constants in the
complementary function for $U$ to those in the complementary function for $\pi$. Then, by
combining the complementary functions and the particular solutions, we can get the time
paths of $\pi$ and $U$.
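Because the discrete-time system is recursive, its time paths can also be generated directly by iteration, which offers a quick check on the stationary values in (19.37). The sketch below steps the two rows of (19.36) forward. The function name `simulate`, the initial values, and the parameter values (those of Example 1 together with a hypothetical $\mu = 0.05$) are assumptions made only for illustration; for these particular values, the characteristic roots from (19.39) are complex with modulus $\sqrt{0.4} < 1$, so the paths should converge.

```python
def simulate(alpha_minus_T, beta, g, j, k, mu, pi0, U0, steps):
    """Iterate (19.36) forward, solving its two rows in sequence."""
    pi, U = pi0, U0
    for _ in range(steps):
        # Row 1: pi_{t+1} - (1 - j + j*g)*pi_t + j*beta*U_t = j*(alpha - T)
        pi_next = (1 - j + j * g) * pi - j * beta * U + j * alpha_minus_T
        # Row 2: -k*g*pi_{t+1} + (1 + beta*k)*U_{t+1} - U_t = k*(alpha - T - mu)
        U_next = (U + k * g * pi_next + k * (alpha_minus_T - mu)) / (1 + beta * k)
        pi, U = pi_next, U_next
    return pi, U

# Assumed values: the Example 1 parameters plus a hypothetical mu.
p = dict(alpha_minus_T=1 / 6, beta=3.0, g=1.0, j=0.75, k=0.5, mu=0.05)
pi_T, U_T = simulate(**p, pi0=0.10, U0=0.02, steps=200)

# Stationary values from (19.37), for comparison.
pi_bar = p["mu"]
U_bar = (p["alpha_minus_T"] - (1 - p["g"]) * p["mu"]) / p["beta"]
print(pi_T, pi_bar, U_T, U_bar)
```

After 200 periods the simulated values are indistinguishable from the stationary values. With other parameter values the roots could lie outside the unit circle, in which case the iteration would diverge instead.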


EXERCISE 19.4 

1. Verify (19.29) by using Cramer's rule. 

2. Verify that the same proportionality relation between $m_1$ and $n_1$ emerges whether we
use the first or the second equation in the system (19.34).

3. Find the time paths (general solutions) of $\pi$ and $U$, given:

U' = -\{V-P) 

4. Find the time paths (general solutions) of $\pi$ and $U$, given:

(o) pr = | — lU t + (ft) ft « ^ - 4U r + nt 

rtt-l - X { = ^(ft ~ *t) *t-1 -7T t = ^(p t - *<) 

Ut+\ — Ut = —(^ - ft+i) Um-i — Ut = “* ft-i) 

19.5 Two-Variable Phase Diagrams

The preceding sections have dealt with the quantitative solutions of linear dynamic systems.
In the present section, we shall discuss the qualitative-graphic (phase-diagram) analysis of
a nonlinear differential-equation system. More specifically, our attention will be focused on
the first-order differential-equation system in two variables, in the general form of

$$\begin{aligned} x'(t) &= f(x, y) \\ y'(t) &= g(x, y) \end{aligned}$$

Note that the time derivatives $x'(t)$ and $y'(t)$ depend only on $x$ and $y$ and that the variable $t$
does not enter into the $f$ and $g$ functions as a separate argument. This feature, which makes
the system an autonomous system, is a prerequisite for the application of the phase-diagram
technique.†

The two-variable phase diagram, like the one-variable version in Sec. 15.6, is limited in
that it can answer only qualitative questions: those concerning the location and the
dynamic stability of the intertemporal equilibrium(s). But, again like the one-variable
version, it has the compensating advantages of being able to handle nonlinear systems as
comfortably as linear ones and to address problems couched in terms of general functions
as readily as those in terms of specific ones.

† In the one-variable phase diagram introduced earlier in Sec. 15.6, the equation $dy/dt = f(y)$ is also
restricted to be autonomous, being forbidden to have the variable $t$ as an explicit argument in the
function $f$.
The Phase Space

When constructing the one-variable phase diagram (Fig. 15.3) for the (autonomous) differential
equation $dy/dt = f(y)$, we simply plotted $dy/dt$ against $y$ on the two axes in a two-dimensional
phase space. Now that the number of variables is doubled, however, how can
we manage to meet the apparent need for more axes? The answer, fortunately, is that the
2-space is all we need.

To see why this is feasible, observe that the most crucial task of phase-diagram construction
is to determine the direction of movement of the variable(s) over time. It is this information,
as embodied in the arrowheads in Fig. 15.3, that enables us to derive the final
qualitative inferences. For the drawing of the said arrowheads, only two things are required:
(1) a demarcation line (call it the "$dy/dt = 0$" line) that provides the locale for any
prospective equilibrium(s) and, more importantly, separates the phase space into two regions,
one characterized by $dy/dt > 0$ and the other by $dy/dt < 0$, and (2) a real line on
which the increases and decreases of $y$ that are implied by any nonzero values of $dy/dt$ can
be indicated. In Fig. 15.3, the demarcation line cited in item 1 is found in the horizontal
axis. But that axis actually also serves as the real line cited in item 2. This means that the
vertical axis, for $dy/dt$, can actually be given up without loss, provided we take care to
distinguish between the $dy/dt > 0$ region and the $dy/dt < 0$ region, say, by labeling the
former with a plus sign, and the latter with a minus sign. This dispensability of one axis is
what makes feasible the placement of a two-variable phase diagram in the 2-space. We now
need two real lines instead of one. But this is automatically taken care of by the standard $x$
and $y$ axes of a two-dimensional diagram. We now also need two demarcation lines
(or curves), one for $dx/dt = 0$ and the other for $dy/dt = 0$. But these are both graphable
in a two-dimensional phase space. And once these are drawn, it would not be difficult to
decide which sides of these lines or curves should be marked with plus and minus signs,
respectively.


The Demarcation Curves 

Given the following autonomous differential-equation system

$$\begin{aligned} x' &= f(x, y) \\ y' &= g(x, y) \end{aligned} \tag{19.40}$$

where $x'$ and $y'$ are short for the time derivatives $x'(t)$ and $y'(t)$, respectively, the two
demarcation curves, to be denoted by $x' = 0$ and $y' = 0$, represent the graphs of the
two equations

$$f(x, y) = 0 \qquad [x' = 0 \text{ curve}] \tag{19.41}$$

$$g(x, y) = 0 \qquad [y' = 0 \text{ curve}] \tag{19.42}$$

If the specific form of the $f$ function is known, (19.41) can be solved for $y$ in terms of $x$ and
the solution plotted in the $xy$ plane as the $x' = 0$ curve. Even if not, however, we can
nonetheless resort to the implicit-function rule and ascertain the slope of the $x' = 0$ curve
to be

$$\left.\frac{dy}{dx}\right|_{x'=0} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{f_x}{f_y} \qquad (f_y \neq 0) \tag{19.43}$$
FIGURE 19.1

As long as the signs of the partial derivatives $f_x$ and $f_y$ $(\neq 0)$ are known, a qualitative clue
to the slope of the $x' = 0$ curve is available from (19.43). By the same token, the slope of
the $y' = 0$ curve can be inferred from the derivative

$$\left.\frac{dy}{dx}\right|_{y'=0} = -\frac{g_x}{g_y} \qquad (g_y \neq 0) \tag{19.44}$$

For a more concrete illustration, let us assume that

$$f_x < 0 \qquad f_y > 0 \qquad g_x > 0 \qquad \text{and} \qquad g_y < 0 \tag{19.45}$$

Then both the $x' = 0$ and $y' = 0$ curves will be positively sloped. If we further assume that

$$-\frac{f_x}{f_y} > -\frac{g_x}{g_y} \qquad [x' = 0 \text{ curve steeper than } y' = 0 \text{ curve}]$$

then we may encounter a situation such as that shown in Fig. 19.1. Note that the demarcation
lines are now possibly curved. Note, also, that they are now no longer required to
coincide with the axes.

The two demarcation curves, intersecting at point $E$, divide the phase space into four
distinct regions, labeled I through IV. Point $E$, where $x$ and $y$ are both stationary
($x' = y' = 0$), represents the intertemporal equilibrium of the system. At any other point,
however, either $x$ or $y$ (or both) would be changing over time, in directions dictated by the
signs of the time derivatives $x'$ and $y'$ at that point. In the present instance, we happen to
have $x' > 0$ ($x' < 0$) to the left (right) of the $x' = 0$ curve; hence the plus (minus) signs on
the left (right) of that curve. These signs are based on the fact that

$$\frac{\partial x'}{\partial x} = f_x < 0 \qquad [\text{by (19.40) and (19.45)}] \tag{19.46}$$

which implies that, as we move continually from west to east in the phase space (as $x$ increases),
$x'$ undergoes a steady decrease, so that the sign of $x'$ must pass through three
stages, in the order $+, 0, -$. Analogously, the derivative

$$\frac{\partial y'}{\partial y} = g_y < 0 \qquad [\text{by (19.40) and (19.45)}] \tag{19.47}$$

implies that, as we move continually from south to north (as $y$ increases), $y'$ steadily
decreases, so that the sign of $y'$ must pass through three stages, in the order $+, 0, -$. Thus
we are led to append the plus signs below, and the minus signs above, the $y' = 0$ curve in
Fig. 19.1.

On the basis of these plus and minus signs, a set of directional arrows can now be drawn
to indicate the intertemporal movement of $x$ and $y$. For any point in region I, $x'$ and $y'$ are
both negative. Hence $x$ and $y$ must both decrease over time, producing a westward movement
for $x$, and a southward movement for $y$. As indicated by the two arrows in region I,
given an initial point located in region I, the intertemporal movement must be in the general
southwestward direction. The exact opposite is true in region III, where $x'$ and $y'$ are
both positive, so that both the $x$ and $y$ variables must increase over time. In contrast, $x'$ and
$y'$ have different signs in region II. With $x'$ positive and $y'$ negative, $x$ should move eastward
and $y$ southward. And region IV displays a tendency exactly opposite to region II.
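The sign-based reasoning behind the directional arrows is mechanical enough to be written out as a small routine. In the sketch below, the linear functions `f` and `g` are hypothetical stand-ins chosen only because they satisfy the sign pattern in (19.45) and the steepness assumption that follows it (the $x' = 0$ curve is $y = 2x$, the $y' = 0$ curve is $y = x/2$, and $E$ is the origin). They are not taken from any model in the text, and the region numbering of Fig. 19.1 is not reproduced.

```python
def f(x, y):
    # Hypothetical x' = f(x, y), with f_x = -2 < 0 and f_y = 1 > 0 as in (19.45).
    return -2.0 * x + y

def g(x, y):
    # Hypothetical y' = g(x, y), with g_x = 1 > 0 and g_y = -2 < 0 as in (19.45).
    return x - 2.0 * y

def direction(x, y):
    """Compass direction of motion implied by the signs of x' and y' at (x, y)."""
    dx, dy = f(x, y), g(x, y)
    north_south = "north" if dy > 0 else "south" if dy < 0 else ""
    east_west = "east" if dx > 0 else "west" if dx < 0 else ""
    return (north_south + east_west) or "stationary"

# One sample point from each of the four regions, plus the equilibrium itself.
for point in [(1, 1), (-1, -1), (0, 1), (0, -1), (0, 0)]:
    print(point, direction(*point))
```

The point $(1, 1)$, lying between the two curves to the northeast of $E$, yields a southwestward movement, and $(-1, -1)$ the exact opposite; the points $(0, 1)$ and $(0, -1)$ yield southeastward and northwestward movements, respectively.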

Streamlines 

For a better grasp of the implications of the directional arrows, we can sketch a series of
streamlines in the phase diagram. Also referred to as phase trajectories (or trajectories for
short) or phase paths, these streamlines serve to map out the dynamic movement of the
system from any conceivable initial point. A few of these are illustrated in Fig. 19.2, which
reproduces the $x' = 0$ and $y' = 0$ curves in Fig. 19.1. Since every point in the phase space
must be located on one streamline or another, there should exist an infinite number of
streamlines, all of which conform to the directional requirements imposed by the $xy$ arrows
in every region. For depicting the general qualitative character of the phase diagram, however,
a few representative streamlines should normally suffice.

Several features may be noted about the streamlines in Fig. 19.2. First, all of them happen
to lead toward point $E$. This makes $E$ a stable (here, globally stable) intertemporal equilibrium.
Later, we shall encounter other types of streamline configurations. Second, while
some streamlines never venture beyond a single region (such as the one passing through
point $A$), others may cross over from one region into another (such as those passing through
$B$ and $C$). Third, where a streamline crosses over, it must have either an infinite slope
(crossing the $x' = 0$ curve) or a zero slope (crossing the $y' = 0$ curve). This is due to the
fact that, along the $x' = 0$ ($y' = 0$) curve, $x$ ($y$) is stationary over time, so the streamline
must not have any horizontal (vertical) movement while crossing that curve. To ensure that
these slope requirements are consistently met, it would be advisable, as soon as the demarcation
curves have been put in place, to add a few short vertical sketching bars across the
$x' = 0$ curve and a few horizontal ones across the $y' = 0$ curve, as guidelines for the drawing
of the streamlines.† Fourth, and last, although the streamlines do explicitly point out the
directions of movement of $x$ and $y$ over time, they provide no specific information regarding
velocity and acceleration, because the phase diagram does not allow for an axis for $t$
(time). It is for this reason, of course, that streamlines carry the alternative name of phase
paths, as opposed to time paths. The only observation we can make about velocity is qualitative
in nature: As we move along a streamline closer and closer to the $x' = 0$ ($y' = 0$)
curve, the velocity of approach in the horizontal (vertical) direction must progressively
diminish. This is due to the steady decrease in the absolute value of the derivative
$x' \equiv dx/dt$ ($y' \equiv dy/dt$) that occurs as we move toward the demarcation line on which
$x'$ ($y'$) takes a zero value.
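Although a phase diagram is a qualitative device, an individual streamline can be approximated numerically by taking many small steps in the direction given by $(x', y')$. The sketch below applies Euler's method to a hypothetical linear system that satisfies (19.45); the functions `f` and `g`, the starting point $(3, 0)$, and the step size are illustrative assumptions only. It also counts how often the path crosses each demarcation curve, detected as a change in the sign of $x'$ or $y'$ along the path.

```python
def f(x, y):
    return -2.0 * x + y      # hypothetical x' = f(x, y); x' = 0 curve is y = 2x

def g(x, y):
    return x - 2.0 * y       # hypothetical y' = g(x, y); y' = 0 curve is y = x/2

def trace(x, y, dt=0.01, steps=2000):
    """Euler approximation of the streamline through the initial point (x, y)."""
    path = [(x, y)]
    for _ in range(steps):
        x, y = x + dt * f(x, y), y + dt * g(x, y)
        path.append((x, y))
    return path

def sign_changes(values):
    """Number of times consecutive nonzero values switch sign."""
    signs = [v > 0 for v in values if v != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))

path = trace(3.0, 0.0)
x_end, y_end = path[-1]
crossings_x = sign_changes([f(x, y) for x, y in path])   # crossings of the x' = 0 curve
crossings_y = sign_changes([g(x, y) for x, y in path])   # crossings of the y' = 0 curve
print(x_end, y_end, crossings_x, crossings_y)
```

Starting from $(3, 0)$, the path first heads northwest, crosses the $y' = 0$ curve exactly once (at which moment $y' = 0$, so the streamline has a zero slope there), and then proceeds southwest toward $E$ at the origin without ever crossing the $x' = 0$ curve.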

Types of Equilibrium 

Depending on the configurations of the streamlines surrounding a particular intertemporal 
equilibrium, that equilibrium may fall into one of four categories: (1) nodes, (2) saddle 
points, (3) foci or focuses, and (4) vortices or vortexes. 

A node is an equilibrium such that all the streamlines associated with it either flow noncyclically
toward it (stable node) or flow noncyclically away from it (unstable node). We
have already encountered a stable node in Fig. 19.2. An unstable node is shown in
Fig. 19.3a. Note that in this particular illustration, it happens that the streamlines never
cross over from region to region. Also, the $x' = 0$ and $y' = 0$ curves happen to be linear,
and, in fact, they themselves serve as streamlines.

A saddle point is an equilibrium with a double personality: it is stable in some directions,
but unstable in others. More accurately, with reference to the illustration in
Fig. 19.3b, a saddle point has exactly one pair of streamlines, called the stable branches
of the saddle point, that flow directly and consistently toward the equilibrium, and exactly
one pair of streamlines, the unstable branches, that flow directly and consistently away
from it. All the other trajectories head toward the saddle point initially but sooner or later
turn away from it. This double personality, of course, is what inspired the name “saddle
point.” Since stability is observed only on the stable branches, which are not reachable as a
matter of course, a saddle point is generically classified as an unstable equilibrium.

The third type of equilibrium, focus, is one characterized by whirling trajectories, all of
which either flow cyclically toward it (stable focus), or flow cyclically away from it (unstable
focus). Figure 19.3c illustrates a stable focus, with only one streamline explicitly drawn
in order to avoid clutter. What causes the whirling motion to occur? The answer lies in the
way the $x' = 0$ and $y' = 0$ curves are positioned. In Fig. 19.3c, the two demarcation curves
are sloped in such a way that they take turns in blockading the streamline flowing in a direction
prescribed by a particular set of $xy$ arrows. As a result, the streamline is frequently
compelled to cross over from one region into another, tracing out a spiral. Whether we get


† To aid your memory, note that the sketching bars across the $x' = 0$ curve should be perpendicular to
the $x$ axis. Similarly, the sketching bars across the $y' = 0$ curve should be perpendicular to the $y$ axis.
FIGURE 19.3

a stable focus (as is the case here) or an unstable one depends on the relative placement of
the two demarcation curves. But in either case, the slope of the streamline at the crossover
points must still be either infinite (crossing $x' = 0$) or zero (crossing $y' = 0$).

Finally, we may have a vortex (or center). This is again an equilibrium with whirling
streamlines, but these streamlines now form a family of loops (concentric circles or ovals)
orbiting around the equilibrium in a perpetual motion. An example of this is given in
Fig. 19.3d, where, again, only a single streamline is shown. Inasmuch as this type of
equilibrium is unattainable from any initial position away from point $E$, a vortex is automatically
classified as an unstable equilibrium.

All the illustrations in Fig. 19.3 display a unique equilibrium. When sufficient nonlinearity
exists, however, the two demarcation curves may intersect more than once, thereby
producing multiple equilibria. In that event, a combination of the previously cited types of
intertemporal equilibrium may exist in the same phase diagram. Although there will then
be more than four regions to contend with, the underlying principle of phase-diagram
analysis will remain basically the same.
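For a linear system $x' = a_{11}x + a_{12}y$, $y' = a_{21}x + a_{22}y$ with its equilibrium at the origin, the four categories above can be told apart from the coefficients alone. The sketch below relies on the standard trace-determinant criteria for linear systems, which are not derived in the present discussion and are brought in here purely as an assumption; the function name `classify` and the sample coefficients are likewise illustrative.

```python
def classify(a11, a12, a21, a22):
    """Classify the equilibrium of x' = a11*x + a12*y, y' = a21*x + a22*y."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    if det == 0:
        return "degenerate (no isolated equilibrium)"
    if det < 0:
        return "saddle point"              # real roots of opposite signs
    stability = "stable" if trace < 0 else "unstable"
    if trace * trace - 4 * det >= 0:
        return stability + " node"         # real roots of the same sign
    if trace == 0:
        return "vortex"                    # pure imaginary roots
    return stability + " focus"            # complex roots with nonzero real part

examples = {
    "A": (-2, 1, 1, -2),
    "B": (1, 0, 0, 2),
    "C": (1, 0, 0, -1),
    "D": (-1, -2, 2, -1),
    "E": (0, -1, 1, 0),
}
for name, coeffs in examples.items():
    print(name, classify(*coeffs))
```

The five sample systems come out, in order, as a stable node, an unstable node, a saddle point, a stable focus, and a vortex. For a nonlinear system the same test can at best describe the neighborhood of an equilibrium.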


Inflation and Monetary Rule à la Obst

As an economic illustration of the two-variable phase diagram, we shall present a model
due to Professor Obst,† which purports to show the ineffectiveness of the conventional
(hence the need for a new) type of countercyclical monetary-policy rule, when an “inflation
adjustment mechanism” is at work. Such a model contrasts with our earlier discussion of
inflation in that, instead of studying the implications of a given rate of monetary expansion,
it looks further into the efficacy of two different monetary rules, each prescribing a different
set of monetary actions to be pursued in the face of various inflationary conditions.

A crucial assumption of the model is the inflation adjustment mechanism

$$\frac{dp}{dt} = h\left(\frac{M_s - M_d}{M_s}\right) = h\left(1 - \frac{M_d}{M_s}\right) \qquad (h > 0) \tag{19.48}$$


which shows that the effect of an excess supply of money ($M_s > M_d$) is to raise the rate of
inflation $p$, rather than the price level $P$. The clearance of the money market would thus
imply not price stability, but only a stable rate of inflation. To facilitate the analysis, the
second equality in (19.48) serves to shift the focus from the excess supply of money to the
demand-supply ratio of money, $M_d/M_s$, which we shall denote by $\mu$. On the assumption
that $M_d$ is directly proportional to the nominal national product $PQ$, we can write

$$\mu \equiv \frac{M_d}{M_s} = \frac{aPQ}{M_s} \qquad (a > 0)$$

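The rates of growth of the several variables are then related by

$$\begin{aligned} \frac{d\mu/dt}{\mu} &= \frac{da/dt}{a} + \frac{dP/dt}{P} + \frac{dQ/dt}{Q} - \frac{dM_s/dt}{M_s} && [\text{by (10.24) and (10.25)}] \\ &= p + q - m && [a = \text{a constant}] \end{aligned} \tag{19.49}$$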


where the lowercase letters $p$, $q$, and $m$ denote, respectively, the rate of inflation, the (exogenous)
rate of growth of the real national product, and the rate of monetary expansion.

Equations (19.48) and (19.49), a set of two differential equations, can jointly determine
the time paths of $p$ and $\mu$, if, for the time being, $m$ is taken to be exogenous. Using the symbols
$p'$ and $\mu'$ to represent the time derivatives $p'(t)$ and $\mu'(t)$, we can express this system
more concisely as

$$\begin{aligned} p' &= h(1 - \mu) \\ \mu' &= (p + q - m)\mu \end{aligned} \tag{19.50}$$


† Norman P. Obst, "Stabilization Policy with an Inflation Adjustment Mechanism," Quarterly Journal of
Economics, May 1978, pp. 355-359. No phase diagrams are given in the Obst paper, but they can be
readily constructed from the model.





FIGURE 19.4 (a), (b)


Given that $h$ is positive, we can have $p' = 0$ if and only if $1 - \mu = 0$. Similarly, since $\mu$ is
always positive, $\mu' = 0$ if and only if $p + q - m = 0$. Thus the $p' = 0$ and $\mu' = 0$ demarcation
curves are associated with the equations

$$\mu = 1 \qquad [p' = 0 \text{ curve}] \tag{19.51}$$

$$p = m - q \qquad [\mu' = 0 \text{ curve}] \tag{19.52}$$


As shown in Fig. 19.4a, these plot as a horizontal line and a vertical line, respectively, and
yield a unique equilibrium at $E$. The equilibrium value $\bar{\mu} = 1$ means that in equilibrium $M_d$
and $M_s$ are equal, clearing the money market. The fact that the equilibrium rate of inflation
is shown to be positive reflects an implicit assumption that $m > q$.

Since the $p' = 0$ curve corresponds to the $x' = 0$ curve in our previous discussion, it
should have vertical sketching bars. And the other curve should have horizontal ones. From
(19.50), we find that

$$\frac{\partial p'}{\partial \mu} = -h < 0 \qquad \text{and} \qquad \frac{\partial \mu'}{\partial p} = \mu > 0 \tag{19.53}$$


with the implication that a northward movement across the $p' = 0$ curve passes through the
$(+, 0, -)$ sequence of signs for $p'$, and an eastward movement across the $\mu' = 0$ curve, the
$(-, 0, +)$ sequence of signs for $\mu'$. Thus we obtain the four sets of directional arrows as
drawn, which generate streamlines (only one of which is shown) that orbit counterclockwise
around point $E$. This, of course, makes $E$ a vortex. Unless the economy happens initially to
be at $E$, it is impossible to attain equilibrium. Instead, there will be never-ceasing fluctuation.
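The closed-orbit character of the vortex can be illustrated by integrating (19.50) numerically. The sketch below is a hedged illustration, not part of the Obst model itself: the parameter values are assumed, a fourth-order Runge-Kutta step is used as the integrator, and the function $H = \tfrac{1}{2}(p - \bar{p})^2 + h(\mu - \ln \mu)$, with $\bar{p} = m - q$, is a quantity we introduce here because direct differentiation along (19.50) shows it to be constant over time. A constant $H$ means the path stays on one closed curve around $E$.

```python
import math

h, q, m = 1.0, 0.02, 0.05     # assumed illustrative values, with m > q

def rhs(p, mu):
    """Right-hand sides of (19.50) with an exogenous m."""
    return h * (1.0 - mu), (p + q - m) * mu

def rk4_step(p, mu, dt):
    """One fourth-order Runge-Kutta step for the system in rhs."""
    k1 = rhs(p, mu)
    k2 = rhs(p + 0.5 * dt * k1[0], mu + 0.5 * dt * k1[1])
    k3 = rhs(p + 0.5 * dt * k2[0], mu + 0.5 * dt * k2[1])
    k4 = rhs(p + dt * k3[0], mu + dt * k3[1])
    p_new = p + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    mu_new = mu + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return p_new, mu_new

def invariant(p, mu):
    """H = (p - p_bar)^2 / 2 + h*(mu - ln mu), constant along solutions of (19.50)."""
    x = p - (m - q)
    return 0.5 * x * x + h * (mu - math.log(mu))

p, mu = (m - q) + 0.2, 1.0    # start due east of the equilibrium E
H0 = invariant(p, mu)
max_drift, min_dist = 0.0, float("inf")
for _ in range(5000):         # 50 units of time with dt = 0.01
    p, mu = rk4_step(p, mu, 0.01)
    max_drift = max(max_drift, abs(invariant(p, mu) - H0))
    min_dist = min(min_dist, math.hypot(p - (m - q), mu - 1.0))
print(max_drift, min_dist)
```

Over the whole run, $H$ stays constant up to integration error, and the distance from $E$ never falls much below its initial value: the economy keeps circling the equilibrium without approaching it.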

The preceding conclusion is, however, the consequence of an exogenous rate of monetary
expansion. What if we now endogenize $m$ by adopting an anti-inflationary monetary
rule? The “conventional” monetary rule would call for gearing the rate of monetary expansion
negatively to the rate of inflation:

$$m = m(p) \qquad m'(p) < 0 \qquad [\text{conventional monetary rule}] \tag{19.54}$$
Such a rule would modify the second equation in (19.50) to

$$\mu' = [p + q - m(p)]\mu \tag{19.55}$$

and alter (19.52) to

$$p = m(p) - q \qquad [\mu' = 0 \text{ curve under conventional monetary rule}] \tag{19.56}$$

Given that $m(p)$ is monotonic, there exists only one value of $p$ (say, $p_1$) that can satisfy
this equation. Hence the new $\mu' = 0$ curve must still emerge as a vertical straight line,
although with a different horizontal intercept $p_1 = m(p_1) - q$. Moreover, from (19.55), we
find that

$$\frac{\partial \mu'}{\partial p} = [1 - m'(p)]\mu > 0 \qquad [\text{by (19.54)}]$$

which is qualitatively no different from the derivative in (19.53). It follows that the directional
arrows must also remain as they are in Fig. 19.4a. In short, we would end up with a
vortex as before.

The alternative monetary rule proposed by Obst is to gear $m$ to the rate of change (rather
than the level) of the rate of inflation:

$$m = m(p') \qquad m'(p') < 0 \qquad [\text{alternative monetary rule}] \tag{19.57}$$

Under this rule, (19.55) and (19.56) will become, respectively,

$$\mu' = [p + q - m(p')]\mu \tag{19.58}$$

$$p = m(p') - q \qquad [\mu' = 0 \text{ curve under alternative monetary rule}] \tag{19.59}$$

This time the $\mu' = 0$ curve would become upward-sloping. For, differentiating (19.59) with
respect to $\mu$ via the chain rule, we have

$$\frac{dp}{d\mu} = m'(p')\frac{dp'}{d\mu} = m'(p')(-h) > 0 \qquad [\text{by (19.50)}]$$

so, by the inverse-function rule, $d\mu/dp$ (the slope of the $\mu' = 0$ curve) is also positive.
This new situation is illustrated in Fig. 19.4b, where, for simplicity, the $\mu' = 0$ curve is
drawn as a straight line, with an arbitrarily assigned slope.† Despite the slope change, the
partial derivative

$$\frac{\partial \mu'}{\partial p} = \mu > 0 \qquad [\text{from (19.58)}]$$

is unchanged from (19.53), so the $\mu$ arrows should retain their original orientation in
Fig. 19.4a. The streamlines (only one of which is shown) will now twirl inwardly toward
the equilibrium at $\bar{\mu} = 1$ and $\bar{p} = m(0) - q$, where $m(0)$ denotes $m(p')$ evaluated at
$p' = 0$. Thus the alternative monetary rule is seen to be capable of converting a vortex into
a stable focus, thereby making possible the asymptotic elimination of the perpetual fluctuation
in the rate of inflation. Indeed, with a sufficiently flat $\mu' = 0$ curve, it is even possible
to turn the vortex into a stable node.

† The slope is inversely proportional to the absolute value of $m'(p')$. The more sensitively the rate of
monetary expansion $m$ is made to respond to the rate of change of the rate of inflation $p'$, the flatter
the $\mu' = 0$ curve will be in Fig. 19.4b.
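To see the alternative rule at work numerically, one can pick a specific $m(p')$ function and integrate the system made up of the first equation in (19.50) and (19.58). The sketch below assumes a linear rule $m(p') = m_0 - c\,p'$ with $c > 0$, so that $m'(p') = -c < 0$. This functional form, the parameter values, and the Runge-Kutta integrator are our own illustrative assumptions and are not taken from the Obst paper. With $m(0) = m_0$, the equilibrium is at $\bar{p} = m_0 - q$ and $\bar{\mu} = 1$.

```python
import math

h, q, m0, c = 1.0, 0.02, 0.05, 0.5   # assumed values; c > 0 makes m'(p') = -c < 0

def rhs(p, mu):
    """p' from (19.50) and mu' from (19.58) under the assumed linear rule."""
    dp = h * (1.0 - mu)
    m = m0 - c * dp                   # hypothetical rule m(p') = m0 - c*p'
    return dp, (p + q - m) * mu

def rk4_step(p, mu, dt):
    """One fourth-order Runge-Kutta step for the system in rhs."""
    k1 = rhs(p, mu)
    k2 = rhs(p + 0.5 * dt * k1[0], mu + 0.5 * dt * k1[1])
    k3 = rhs(p + 0.5 * dt * k2[0], mu + 0.5 * dt * k2[1])
    k4 = rhs(p + dt * k3[0], mu + dt * k3[1])
    p_new = p + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    mu_new = mu + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return p_new, mu_new

p_bar, mu_bar = m0 - q, 1.0           # equilibrium: p = m(0) - q, mu = 1
p, mu = p_bar + 0.2, 1.0
initial_dist = math.hypot(p - p_bar, mu - mu_bar)
crossings, above = 0, None            # crossings of the p' = 0 line (mu = 1)
for step in range(6000):              # 60 units of time with dt = 0.01
    p, mu = rk4_step(p, mu, 0.01)
    now_above = mu > mu_bar
    if above is not None and now_above != above and step < 2000:
        crossings += 1
    above = now_above
final_dist = math.hypot(p - p_bar, mu - mu_bar)
print(initial_dist, final_dist, crossings)
```

Under these assumptions the distance from the equilibrium shrinks toward zero while the path repeatedly crosses the $p' = 0$ line, which is the inward-spiraling pattern of a stable focus. A much larger $c$, which corresponds to a flatter $\mu' = 0$ curve, would be the natural next experiment for the node case.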
EXERCISE 19.5

1. Show that the two-variable phase diagram can also be used, if the model consists of a 
single second-order differential equation, y"(t) = f(y', y), instead of two first-order 
equations. 

2. The plus and minus signs appended to the two sides of the $x' = 0$ and $y' = 0$ curves in
Fig. 19.1 are based on the partial derivatives $\partial x'/\partial x$ and $\partial y'/\partial y$, respectively. Can the
same conclusions be obtained from the derivatives $\partial x'/\partial y$ and $\partial y'/\partial x$?

3. Using Fig. 19.2, verify that if a streamline does not have an infinite (zero) slope when
crossing the $x' = 0$ ($y' = 0$) curve, it will necessarily violate the directional restrictions
imposed by the $xy$ arrows.

4. As special cases of the differential-equation system (19.40), assume that 

(a) $f_x = 0$, $f_y > 0$, $g_x > 0$, and $g_y = 0$

(b) $f_x = 0$, $f_y < 0$, $g_x < 0$, and $g_y = 0$

For each case, construct an appropriate phase diagram, draw the streamlines, and 
determine the nature of the equilibrium. 

5. (a) Show that it is possible to produce either a stable node or a stable focus from the
differential-equation system (19.40), if
$f_x < 0$, $f_y > 0$, $g_x < 0$, and $g_y < 0$
(b) What special feature(s) in your phase-diagram construction are responsible for the
difference in the outcomes (node versus focus)?

6. With reference to the Obst model, verify that if the positively sloped $\mu' = 0$ curve in
Fig. 19.4b is made sufficiently flat, the streamlines, although still characterized by
crossovers, will converge to the equilibrium in the manner of a node rather than a
focus.


19.6 Linearization of a Nonlinear
Differential-Equation System

Another qualitative technique of analyzing a nonlinear differential-equation system is to
draw inferences from the linear approximation to that system, to be derived from the Taylor
expansion of the given system around its equilibrium.† We learned in Sec. 9.5 that a linear
(or even a higher-order polynomial) approximation to an arbitrary function $\phi(x)$ can give
us the exact value of $\phi(x)$ at the point of expansion, but will entail progressively larger
errors of approximation as we move farther away from the point of expansion. The same is
true of the linear approximation to a nonlinear system. At the point of expansion (here, the
equilibrium point $E$) the linear approximation can pinpoint exactly the same equilibrium
as the original nonlinear system. And in a sufficiently small neighborhood of $E$, the linear
approximation should have the same general streamline configuration as the original system.
As long as we are willing to confine our stability inferences to the immediate neighborhood
of the equilibrium, therefore, the linear approximation could serve as an adequate
source of information. Such analysis, referred to as local stability analysis, can be used

† In the case of multiple equilibria, each equilibrium requires a separate linear approximation.
either by itself, or as a supplement to the phase-diagram analysis. We shall deal with the
two-variable case only.


Taylor Expansion and Linearization 

Given an arbitrary (successively differentiable) one-variable function $\phi(x)$, the Taylor
expansion around a point $x_0$ gives the series

$$\phi(x) = \phi(x_0) + \phi'(x_0)(x - x_0) + \frac{\phi''(x_0)}{2!}(x - x_0)^2 + \cdots$$

where a polynomial involving various powers of $(x - x_0)$ appears on the right. A similar
structure characterizes the Taylor expansion of a function of two variables $f(x, y)$ around
any point $(x_0, y_0)$. With two variables in the picture, however, the resulting polynomial
would comprise various powers of $(y - y_0)$ as well as $(x - x_0)$, and in fact also the products
of these two expressions:

$$
\begin{aligned}
f(x, y) = {} & f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
& + \frac{1}{2!}\left[ f_{xx}(x_0, y_0)(x - x_0)^2 + 2 f_{xy}(x_0, y_0)(x - x_0)(y - y_0) + f_{yy}(x_0, y_0)(y - y_0)^2 \right] \\
& + \cdots + R_n
\end{aligned}
\tag{19.60}
$$

Note that the coefficients of the $(x - x_0)$ and $(y - y_0)$ expressions are now the partial
derivatives of $f$, all evaluated at the expansion point $(x_0, y_0)$.

From the Taylor series of a function, the linear approximation (or linearization for
short) is obtained by simply dropping all terms of order higher than one. Thus, for the
one-variable case, the linearization is the following linear function of $x$:

$$\phi(x_0) + \phi'(x_0)(x - x_0)$$

Similarly, the linearization of (19.60) is the following linear function of $x$ and $y$:

$$f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

Besides, by substituting the function symbol $g$ for $f$ in this result, we can also get the
corresponding linearization of $g(x, y)$. It follows that, given the nonlinear system

$$
\begin{aligned}
x' &= f(x, y) \\
y' &= g(x, y)
\end{aligned}
\tag{19.61}
$$

its linearization around the expansion point $(x_0, y_0)$ can be written as

$$
\begin{aligned}
x' &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
y' &= g(x_0, y_0) + g_x(x_0, y_0)(x - x_0) + g_y(x_0, y_0)(y - y_0)
\end{aligned}
\tag{19.62}
$$

If the specific forms of the functions $f$ and $g$ are known, then $f(x_0, y_0)$, $f_x(x_0, y_0)$,
$f_y(x_0, y_0)$ and their counterparts for the $g$ function can all be assigned specific values and
the linear system (19.62) solved quantitatively. However, even if the $f$ and $g$ functions are
given in general forms, qualitative analysis is still possible, provided only that the signs of
$f_x$, $f_y$, $g_x$, and $g_y$ are ascertainable.
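The way a linearization is exact at the expansion point and loses accuracy farther away can be seen in a small numerical sketch. This is only an illustration under stated assumptions: the helper `linearize`, the central-difference step, and the sample function $f(x, y) = xy - 2$ with expansion point $(1, 2)$ are all chosen here for the example and are not part of the text's derivation.

```python
def linearize(f, x0, y0, step=1e-6):
    """Return the linearization of f around (x0, y0) as a callable.

    The partial derivatives f_x and f_y are estimated by central
    differences, so the coefficients are numerical approximations.
    """
    f0 = f(x0, y0)
    fx = (f(x0 + step, y0) - f(x0 - step, y0)) / (2 * step)
    fy = (f(x0, y0 + step) - f(x0, y0 - step)) / (2 * step)

    def approx(x, y):
        # f(x0, y0) + f_x (x - x0) + f_y (y - y0)
        return f0 + fx * (x - x0) + fy * (y - y0)

    return approx


def f(x, y):
    # Hypothetical nonlinear function, used only for illustration.
    return x * y - 2


f_lin = linearize(f, 1.0, 2.0)

# Approximation error at, near, and far from the expansion point.
error_at_point = abs(f(1.0, 2.0) - f_lin(1.0, 2.0))
error_near = abs(f(1.1, 2.1) - f_lin(1.1, 2.1))
error_far = abs(f(2.0, 3.0) - f_lin(2.0, 3.0))
```

For this particular function the error equals the dropped product term $(x - x_0)(y - y_0)$, so it vanishes at the expansion point and grows with the distance from it.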
The Reduced Linearization

For purposes of local stability analysis, the linearization (19.62) can be put into a simpler
form. First, since our point of expansion is to be the equilibrium point $(\bar{x}, \bar{y})$, we should
replace $(x_0, y_0)$ by $(\bar{x}, \bar{y})$. More substantively, since at the equilibrium point we have
$x' = y' = 0$ by definition, it follows that

$$f(\bar{x}, \bar{y}) = g(\bar{x}, \bar{y}) = 0 \qquad \text{[by (19.61)]}$$

so the first term on the right side of each equation in (19.62) can be dropped. Making these
changes, then multiplying out the remaining terms on the right of (19.62) and rearranging,
we obtain another version of the linearization:

$$
\begin{aligned}
x' - f_x(\bar{x}, \bar{y})\,x - f_y(\bar{x}, \bar{y})\,y &= -f_x(\bar{x}, \bar{y})\,\bar{x} - f_y(\bar{x}, \bar{y})\,\bar{y} \\
y' - g_x(\bar{x}, \bar{y})\,x - g_y(\bar{x}, \bar{y})\,y &= -g_x(\bar{x}, \bar{y})\,\bar{x} - g_y(\bar{x}, \bar{y})\,\bar{y}
\end{aligned}
\tag{19.63}
$$


Note that, in (19.63), each term on the right of the equals signs represents a constant. We
took the trouble to separate out these constant terms so that we can now drop them all, to
get to the reduced equations of the linearization. The result, which may be written in matrix
notation as

$$
\begin{bmatrix} x' \\ y' \end{bmatrix}
- \begin{bmatrix} f_x & f_y \\ g_x & g_y \end{bmatrix}_{(\bar{x}, \bar{y})}
\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\tag{19.64}
$$

constitutes the reduced linearization of (19.61). Inasmuch as qualitative analysis depends
exclusively on the knowledge of the characteristic roots, which, in turn, hinge only on
the reduced equations of a system, (19.64) is all we need for the desired local stability
analysis.

Going a step further, it may be observed that the only distinguishing property of the
reduced linearization lies in the matrix of partial derivatives, that is, the Jacobian matrix of the
nonlinear system (19.61), evaluated at the equilibrium $(\bar{x}, \bar{y})$. Hence, in the final analysis,
the local stability or instability of the equilibrium is predicated solely on the makeup of the
said Jacobian. For notational convenience in the ensuing discussion, we shall denote the
Jacobian evaluated at the equilibrium by $J_E$ and its elements by $a$, $b$, $c$, and $d$:

$$
J_E \equiv \begin{bmatrix} f_x & f_y \\ g_x & g_y \end{bmatrix}_{(\bar{x}, \bar{y})}
= \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\tag{19.65}
$$

It will be assumed that the two differential equations are functionally independent. Then we
shall always have $|J_E| \neq 0$. (For some cases where $|J_E| = 0$, see Exercise 19.6-4.)
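When $f$ and $g$ are known explicitly, the elements $a$, $b$, $c$, and $d$ of (19.65) can also be estimated numerically. The following is a minimal sketch rather than a prescribed method: the function `jacobian_at`, the central-difference step, and the sample system $x' = y - x^2$, $y' = x - y$ (which has an equilibrium at $(1, 1)$) are assumptions introduced for illustration.

```python
def jacobian_at(f, g, x, y, step=1e-6):
    """Estimate (a, b, c, d) = (f_x, f_y, g_x, g_y) at the point (x, y)."""
    def partials(func):
        # Central differences in the x and y directions.
        dx = (func(x + step, y) - func(x - step, y)) / (2 * step)
        dy = (func(x, y + step) - func(x, y - step)) / (2 * step)
        return dx, dy

    a, b = partials(f)
    c, d = partials(g)
    return a, b, c, d


# Hypothetical system, for illustration only: x' = y - x**2, y' = x - y.
def f(x, y):
    return y - x ** 2


def g(x, y):
    return x - y


# (1, 1) is an equilibrium because f(1, 1) = g(1, 1) = 0.
a, b, c, d = jacobian_at(f, g, 1.0, 1.0)
determinant = a * d - b * c   # |J_E|
trace = a + d                 # sum of the principal-diagonal elements
```

Here the exact Jacobian at $(1, 1)$ has rows $(-2, 1)$ and $(1, -1)$, so the determinant is nonzero, in line with the assumption of functional independence.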


Local Stability Analysis 

According to (19.16), and using (19.65), the characteristic equation of the reduced
linearization should be

$$\begin{vmatrix} r - a & -b \\ -c & r - d \end{vmatrix} = r^2 - (a + d)r + (ad - bc) = 0$$

It is clear that the characteristic roots depend critically on the expressions $(a + d)$ and
$(ad - bc)$. The latter is merely the determinant of the Jacobian in (19.65):

$$ad - bc = |J_E|$$

And the former, representing the sum of the principal-diagonal elements of that Jacobian,
is called the trace of $J_E$, symbolized by $\operatorname{tr} J_E$:

$$a + d = \operatorname{tr} J_E$$

Accordingly, the characteristic roots can be expressed as

$$r_1, r_2 = \frac{\operatorname{tr} J_E \pm \sqrt{(\operatorname{tr} J_E)^2 - 4|J_E|}}{2}$$

The relative magnitudes of $(\operatorname{tr} J_E)^2$ and $4|J_E|$ will determine whether the two roots are real
or complex, that is, whether the time paths of $x$ and $y$ are steady or fluctuating. To check the
dynamic stability of equilibrium, on the other hand, we need to ascertain the algebraic signs
of the two roots. For that purpose, the following two relationships will prove to be most
helpful [cf. (16.5) and (16.6)]:

$$r_1 + r_2 = \operatorname{tr} J_E \tag{19.66}$$

$$r_1 r_2 = |J_E| \tag{19.67}$$
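Relationships (19.66) and (19.67) are easy to confirm numerically. The sketch below computes both roots from the trace and determinant with the quadratic formula; the function name `characteristic_roots` and the sample elements are assumptions made for illustration, and `cmath.sqrt` is used so that a negative discriminant yields complex roots instead of an error.

```python
import cmath


def characteristic_roots(a, b, c, d):
    """Roots of r**2 - (a + d) r + (ad - bc) = 0 for the matrix [[a, b], [c, d]]."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace ** 2 - 4 * det)   # complex-safe square root
    return (trace + disc) / 2, (trace - disc) / 2


# Hypothetical Jacobian elements, for illustration only.
a, b, c, d = -2.0, 1.0, 1.0, -1.0
r1, r2 = characteristic_roots(a, b, c, d)

root_sum = r1 + r2        # should equal the trace, a + d
root_product = r1 * r2    # should equal the determinant, ad - bc
```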


Case 1  $(\operatorname{tr} J_E)^2 > 4|J_E|$. In this case, the roots are real and distinct, and no fluctuation is
possible. Hence the equilibrium can be either a node or a saddle point, but never a focus or
vortex. Since $r_1 \neq r_2$, there exist three distinct possibilities of sign combination: both
roots negative, both roots positive, and two roots with opposite signs.† Taking into account
the information in (19.66) and (19.67), these three possibilities are characterized by:

(i) $r_1 < 0,\ r_2 < 0 \;\Rightarrow\; |J_E| > 0;\ \operatorname{tr} J_E < 0$

(ii) $r_1 > 0,\ r_2 > 0 \;\Rightarrow\; |J_E| > 0;\ \operatorname{tr} J_E > 0$

(iii) $r_1 > 0,\ r_2 < 0 \;\Rightarrow\; |J_E| < 0;\ \operatorname{tr} J_E \gtreqless 0$

Under Possibility i, with both roots negative, both complementary functions $x_c$ and $y_c$ tend
to zero as $t$ becomes infinite. The equilibrium is thus a stable node. The opposite is true
under Possibility ii, which describes an unstable node. In contrast, with two roots of
opposite signs, Possibility iii yields a saddle point.


To see this last case more clearly, recall that the complementary functions of the two
variables under Case 1 take the general form

$$
\begin{aligned}
x_c &= A_1 e^{r_1 t} + A_2 e^{r_2 t} \\
y_c &= k_1 A_1 e^{r_1 t} + k_2 A_2 e^{r_2 t}
\end{aligned}
$$

where the arbitrary constants $A_1$ and $A_2$ are to be determined from the initial conditions. If
the initial conditions are such that $A_1 = 0$, the positive root $r_1$ will drop out of the picture,
leaving it to the negative root $r_2$ to make the equilibrium stable. Such initial conditions
pertain to the points located on the stable branches of the saddle point. On the other hand, if
the initial conditions are such that $A_2 = 0$, the negative root $r_2$ will vanish from the scene,
leaving it to the positive root $r_1$ to make the equilibrium unstable. Such initial conditions
relate to the points lying on the unstable branches. Inasmuch as all the other initial conditions
also involve $A_1 \neq 0$, they must all give rise to divergent complementary functions,
too. Thus Possibility iii yields a saddle point.

† Since we have ruled out $|J_E| = 0$, no root can take a zero value.

Case 2  $(\operatorname{tr} J_E)^2 = 4|J_E|$. As the roots are repeated in this case, only two possibilities of sign
combination can arise:

(iv) $r_1 < 0,\ r_2 < 0 \;\Rightarrow\; |J_E| > 0;\ \operatorname{tr} J_E < 0$

(v) $r_1 > 0,\ r_2 > 0 \;\Rightarrow\; |J_E| > 0;\ \operatorname{tr} J_E > 0$

These two possibilities are mere duplicates of Possibilities i and ii. Thus they point to a
stable node and an unstable node, respectively.

Case 3  $(\operatorname{tr} J_E)^2 < 4|J_E|$. This time, with complex roots $h \pm vi$, cyclical fluctuation is present,
and we must encounter either a focus or a vortex. On the basis of (19.66) and (19.67),
we have in the present case

$$
\begin{aligned}
\operatorname{tr} J_E &= r_1 + r_2 = (h + vi) + (h - vi) = 2h \\
|J_E| &= r_1 r_2 = (h + vi)(h - vi) = h^2 + v^2
\end{aligned}
$$

Thus $\operatorname{tr} J_E$ has to take the same sign as $h$, whereas $|J_E|$ is invariably positive. Consequently,
there are three possible outcomes:

(vi) $h < 0 \;\Rightarrow\; |J_E| > 0;\ \operatorname{tr} J_E < 0$

(vii) $h > 0 \;\Rightarrow\; |J_E| > 0;\ \operatorname{tr} J_E > 0$

(viii) $h = 0 \;\Rightarrow\; |J_E| > 0;\ \operatorname{tr} J_E = 0$

These are associated, respectively, with damped fluctuation, explosive fluctuation, and
uniform fluctuation. In other words, Possibility vi implies a stable focus; Possibility vii, an
unstable focus; and Possibility viii, a vortex.

The conclusions from the preceding discussion are summarized in Table 19.1 to facilitate
qualitative inferences from the signs of $|J_E|$ and $\operatorname{tr} J_E$. Three features of the table are
especially noteworthy. First, a negative $|J_E|$ is exclusively tied to the saddle-point type of
equilibrium. This suggests that $|J_E| < 0$ is a necessary-and-sufficient condition for a
saddle point. Second, a zero value for $\operatorname{tr} J_E$ occurs only under two circumstances: when
there is a saddle point or a vortex. These two circumstances are, however, distinguishable
from each other by the sign of $|J_E|$. Accordingly, a zero $\operatorname{tr} J_E$ coupled with a positive $|J_E|$
is necessary-and-sufficient for a vortex. Third, while a negative sign for $\operatorname{tr} J_E$ is necessary
for dynamic stability, it is not sufficient, on account of the possibility of a saddle point.
Nevertheless, when a negative $\operatorname{tr} J_E$ is accompanied by a positive $|J_E|$, we do have a
necessary-and-sufficient condition for dynamic stability.


TABLE 19.1  Local Stability Analysis of a Two-Variable Nonlinear Differential-Equation System

Case                         Sign of |J_E|    Sign of tr J_E    Type of Equilibrium
1. (tr J_E)^2 > 4|J_E|       +                -                 Stable node
                             +                +                 Unstable node
                             -                +, 0, -           Saddle point
2. (tr J_E)^2 = 4|J_E|       +                -                 Stable node
                             +                +                 Unstable node
3. (tr J_E)^2 < 4|J_E|       +                -                 Stable focus
                             +                +                 Unstable focus
                             +                0                 Vortex

The discussion leading to the summary in Table 19.1 has been conducted in the context 
of a linear approximation to a nonlinear system. However, the contents of that table are
obviously applicable also to the qualitative analysis of a system that is linear to begin with. In
the latter case, the elements of the Jacobian matrix will be a set of given constants, so there 
is no need to evaluate them at the equilibrium. Since there is no approximation process 
involved, the stability inferences will no longer be “local” in nature but will have global 
validity. 
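The decision rules of Table 19.1 can be written out as a short function. This is a sketch under stated assumptions, not a canonical implementation: the name `classify_equilibrium` and the sample matrices are introduced here, and the exact comparisons with zero suit hand-entered constants but would need a tolerance for Jacobian elements obtained numerically.

```python
def classify_equilibrium(a, b, c, d):
    """Apply Table 19.1 to the Jacobian [[a, b], [c, d]] evaluated at E."""
    det = a * d - b * c
    trace = a + d
    if det == 0:
        # The table presupposes functionally independent equations.
        raise ValueError("|J_E| = 0 is not covered by Table 19.1")
    if det < 0:
        return "saddle point"          # holds whatever the sign of the trace
    if trace ** 2 >= 4 * det:
        # Cases 1 and 2: real roots (distinct or repeated), so a node.
        return "stable node" if trace < 0 else "unstable node"
    # Case 3: complex roots, so a focus or a vortex.
    if trace < 0:
        return "stable focus"
    if trace > 0:
        return "unstable focus"
    return "vortex"


# Hypothetical Jacobians, one per type of outcome.
results = [
    classify_equilibrium(2, 1, 2, -1),     # negative determinant
    classify_equilibrium(-2, -1, 0, -1),   # det = 2, trace = -3
    classify_equilibrium(1, -1, 1, 1),     # det = 2, trace = 2
    classify_equilibrium(0, -1, 1, 0),     # det = 1, trace = 0
]
```

With a positive determinant and real roots the trace cannot be zero, because $(\operatorname{tr} J_E)^2 \geq 4|J_E| > 0$, which is why the node branch needs only one sign test.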


Example 1 


Analyze the local stability of the nonlinear system 


$$
\begin{aligned}
x' &= f(x, y) = xy - 2 \\
y' &= g(x, y) = 2x - y
\end{aligned}
\qquad (x, y \geq 0)
$$

First, setting $x' = y' = 0$, and noting the nonnegativity of $x$ and $y$, we find a single equilibrium
$E$ at $(\bar{x}, \bar{y}) = (1, 2)$. Then, by taking the partial derivatives of $x'$ and $y'$, and evaluating
them at $E$, we obtain

$$
J_E = \begin{bmatrix} f_x & f_y \\ g_x & g_y \end{bmatrix}_{(\bar{x}, \bar{y})}
= \begin{bmatrix} y & x \\ 2 & -1 \end{bmatrix}_{(1, 2)}
= \begin{bmatrix} 2 & 1 \\ 2 & -1 \end{bmatrix}
$$

Since $|J_E| = -4$ is negative, we can immediately conclude that the equilibrium is locally a
saddle point.

Note that while the first row of the Jacobian matrix originally contains the variables $y$ and
$x$, the second row does not. The reason for the difference is that the second equation in the
given system is originally linear, and requires no linearization.


Example 2 


Given the nonlinear system 


$$
\begin{aligned}
x' &= x^2 - y \\
y' &= 1 - y
\end{aligned}
$$

we can, by setting $x' = y' = 0$, find two equilibrium points: $E_1 = (1, 1)$ and $E_2 = (-1, 1)$.
Thus we need two separate linearizations. Evaluating the Jacobian
$\begin{bmatrix} 2x & -1 \\ 0 & -1 \end{bmatrix}$ at the two equilibriums in turn, we obtain

$$
J_{E_1} = \begin{bmatrix} 2 & -1 \\ 0 & -1 \end{bmatrix}
\qquad \text{and} \qquad
J_{E_2} = \begin{bmatrix} -2 & -1 \\ 0 & -1 \end{bmatrix}
$$

The first of these has a negative determinant; thus $E_1 = (1, 1)$ is locally a saddle point. From
the second, we find that $|J_{E_2}| = 2$ and $\operatorname{tr} J_{E_2} = -3$. Hence, by Table 19.1, $E_2 = (-1, 1)$ is
locally a stable node under Case 1.
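The local conclusions of Example 2 can be checked against the nonlinear system itself by numerical integration. The sketch below uses a plain Euler scheme; the function `simulate`, the step size, the starting points, and the horizons are assumptions chosen for illustration, and a crude integrator of this kind gives only a rough picture of the streamlines.

```python
def simulate(x, y, horizon, dt=0.01):
    """Euler integration of x' = x**2 - y, y' = 1 - y starting from (x, y)."""
    for _ in range(round(horizon / dt)):
        dx = x * x - y
        dy = 1.0 - y
        x, y = x + dx * dt, y + dy * dt
    return x, y


# Start near E2 = (-1, 1): the path should settle back at E2.
near_e2 = simulate(-0.9, 1.1, horizon=20.0)

# Start near E1 = (1, 1), off its stable branches: the path moves away.
# The horizon is kept short because x grows without bound on this path.
near_e1 = simulate(1.01, 1.0, horizon=2.0)
```

The first path returns to $(-1, 1)$, as a stable node requires, while the second leaves the neighborhood of $(1, 1)$, as expected of a saddle point for any start off the stable branches.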


Example 3 


Does the linear system 

$$
\begin{aligned}
x' &= x - y + 2 \\
y' &= x + y + 4
\end{aligned}
$$

possess a stable equilibrium? To answer such a qualitative question, we can simply concentrate
on the reduced equations and ignore the constants 2 and 4 altogether. As may be
expected from a linear system, the Jacobian
$\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$
has as its elements four constants.
Inasmuch as its determinant and trace are both equal to 2, the equilibrium falls under
Case 3 and is an unstable focus. Note that this conclusion is reached without having to solve
for the equilibrium. Note, also, that the conclusion is in this case globally valid.
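As a check on Example 3, the characteristic roots can be worked out explicitly from the root formula of this section, with $\operatorname{tr} J_E = 2$ and $|J_E| = 2$:

```latex
\begin{aligned}
r_1, r_2 &= \frac{\operatorname{tr} J_E \pm \sqrt{(\operatorname{tr} J_E)^2 - 4|J_E|}}{2}
          = \frac{2 \pm \sqrt{4 - 8}}{2}
          = 1 \pm i \\
h &= 1 > 0, \qquad v = 1
\end{aligned}
```

The roots are complex with a positive real part $h$, which corresponds to explosive fluctuation and hence to the unstable focus reported in the example.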

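Example 4


Analyze the local stability of the Obst model (19.50),

$$
\begin{aligned}
p' &= h(1 - \mu) \\
\mu' &= (p + q - m)\mu
\end{aligned}
$$

assuming that the rate of monetary expansion $m$ is exogenous (no monetary rule is followed).
According to Fig. 19.4a, the equilibrium of this model occurs at $E = (\bar{p}, \bar{\mu}) =
(m - q, 1)$. The Jacobian matrix evaluated at $E$ is

$$
J_E = \begin{bmatrix}
\dfrac{\partial p'}{\partial p} & \dfrac{\partial p'}{\partial \mu} \\[2ex]
\dfrac{\partial \mu'}{\partial p} & \dfrac{\partial \mu'}{\partial \mu}
\end{bmatrix}_E
= \begin{bmatrix} 0 & -h \\ \mu & p + q - m \end{bmatrix}_{(m - q,\, 1)}
= \begin{bmatrix} 0 & -h \\ 1 & 0 \end{bmatrix}
$$

Since $|J_E| = h > 0$, and $\operatorname{tr} J_E = 0$, Table 19.1 indicates that the equilibrium is locally a vortex.
This conclusion is consistent with that of the phase-diagram analysis in Sec. 19.5.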

Example 5 


Analyze the local stability of the Obst model, assuming that the alternative monetary rule is
as follows:

$$
\begin{aligned}
p' &= h(1 - \mu) && \text{[from (19.50)]} \\
\mu' &= [p + q - m(p')]\mu && \text{[from (19.58)]}
\end{aligned}
$$

Note that since $p'$ is a function of $\mu$, the function $m(p')$ is in the present model also a
function of $\mu$. Thus we have to apply the product rule in finding $\partial \mu'/\partial \mu$. At the equilibrium
$E$, where $p' = \mu' = 0$, we have $\bar{\mu} = 1$ and $\bar{p} = m(0) - q$. The Jacobian evaluated at $E$ is,
therefore,

$$
J_E = \begin{bmatrix} 0 & -h \\ \mu & p + q - m(p') - m'(p')(-h)\mu \end{bmatrix}_E
= \begin{bmatrix} 0 & -h \\ 1 & m'(0)h \end{bmatrix}
$$

where $m'(0)$ is negative by (19.57). According to Table 19.1, with $|J_E| = h > 0$ and
$\operatorname{tr} J_E = m'(0)h < 0$, we can have either a stable focus or a stable node, depending on the
relative magnitudes of $(\operatorname{tr} J_E)^2$ and $4|J_E|$. To be specific, the larger the absolute value of the
derivative $m'(0)$, the larger the absolute value of $\operatorname{tr} J_E$ will be and the more likely $(\operatorname{tr} J_E)^2$ will
exceed $4|J_E|$, to produce a stable node instead of a stable focus. This conclusion is again
consistent with what we learned from the phase-diagram analysis.
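The dividing line between the two outcomes in Example 5 can be made explicit. Using only $|J_E| = h > 0$ and $\operatorname{tr} J_E = m'(0)h$ from the example, the condition for real roots reduces to a threshold on $|m'(0)|$:

```latex
\begin{aligned}
(\operatorname{tr} J_E)^2 \geq 4|J_E|
  &\iff [m'(0)]^2 h^2 \geq 4h \\
  &\iff |m'(0)| \geq \frac{2}{\sqrt{h}} \qquad (h > 0)
\end{aligned}
```

So, under these expressions, the equilibrium is a stable node when $|m'(0)| \geq 2/\sqrt{h}$ (with repeated roots at equality) and a stable focus when $|m'(0)| < 2/\sqrt{h}$.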


EXERCISE 19.6 


1. Analyze the local stability of each of the following nonlinear systems:

(a) $x' = e^x - 1$, $y' = y e^x$

(b) $x' = x + 2y$, $y' = x^2 + y$

(c) $x' = 1 - e^y$, $y' = 5x - y$

(d) $x' = x^3 + 3x^2 y + y$, $y' = x(1 + y^2)$
2. Use Table 19.1 to determine the type of equilibrium a nonlinear system would have locally,
given that:

(a) $f_x = 0$, $f_y > 0$, $g_x > 0$, and $g_y = 0$

(b) $f_x = 0$, $f_y < 0$, $g_x < 0$, and $g_y = 0$

(c) $f_x < 0$, $f_y > 0$, $g_x < 0$, and $g_y < 0$

Are your results consistent with your answers to Exercises 19.5-4 and 19.5-5?

3. Analyze the local stability of the Obst model, assuming that the conventional monetary 
rule is followed. 

4. The following two systems both possess zero-valued Jacobians. Construct a phase
diagram for each, and deduce the locations of all the equilibriums that exist:

(a) $x' = x + y$, $y' = -x - y$

(b) $x' = 0$, $y' = 0$




Optimal Control Theory 


At the end of Chap. 13, we referred to dynamic optimization as a type of problem we were 
not ready to tackle because we did not yet have the tools of dynamic analysis such as 
differential equations. Now that we have acquired such tools, we can finally try a taste of
dynamic optimization. 

The classical approach to dynamic optimization is called the calculus of variations. In 
the later development of this methodology, however, a more powerful approach known as 
optimal control theory has, for the most part, supplanted the calculus of variations. For this 
reason, we shall, in this chapter, confine our attention to optimal control theory, explaining 
its basic nature, introducing the major solution tool called the maximum principle, and 
illustrating its use in some elementary economic models.†


20.1 The Nature of Optimal Control

In static optimization, the task is to find a single value for each choice variable, such that a 
stated objective function will be maximized or minimized, as the case may be. Such a problem
is devoid of a time dimension. In contrast, time enters explicitly and prominently in a
dynamic optimization problem. In such a problem, we will always have in mind a planning
period, say from an initial time $t = 0$ to a terminal time $t = T$, and try to find the best
course of action to take during that entire period. Thus the solution for any variable will 
take the form of not a single value, but a complete time path. 

Suppose the problem is one of profit maximization over a time period. At any point of
time $t$, we have to choose the value of some control variable, $u(t)$, which will then affect
the value of some state variable, $y(t)$, via a so-called equation of motion. In turn, $y(t)$ will
determine the profit $\pi(t)$. Since our objective is to maximize the profit over the entire period,
the objective function should take the form of a definite integral of $\pi$ from $t = 0$ to
$t = T$. To be complete, the problem also specifies the initial value of the state variable $y$,
$y(0)$, and the terminal value of $y$, $y(T)$, or alternatively, the range of values that $y(T)$ is
allowed to take.

† For a more complete treatment of optimal control theory (as well as "calculus of variations"), the
student is referred to Elements of Dynamic Optimization by Alpha C. Chiang, McGraw-Hill, New York,
1992, now published by Waveland Press, Inc., Prospect Heights, Illinois. This chapter draws heavily
from material in this cited book.

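Taking into account the preceding, we can state the simplest problem of optimal control
as:

$$
\begin{aligned}
\text{Maximize} \quad & \int_0^T F(t, y, u)\,dt \\
\text{subject to} \quad & y' = f(t, y, u) \\
& y(0) = A \qquad y(T) \text{ free} \\
\text{and} \quad & u(t) \in U \quad \text{for all } t \in [0, T]
\end{aligned}
\tag{20.1}
$$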

The first line of (20.1), the objective function, is an integral whose integrand $F(t, y, u)$
stipulates how the choice of the control variable $u$ at time $t$, along with the resulting $y$ at
time $t$, determines our object of maximization at $t$. The second line is the equation of motion
for the state variable $y$. What this equation does is to provide the mechanism whereby
our choice of control variable $u$ can be translated into a specific pattern of movement of the
state variable $y$. Normally, the linkage between $u$ and $y$ can be adequately described by a
first-order differential equation $y' = f(t, y, u)$. However, if it happens that the pattern of
change of the state variable requires a second-order differential equation, then we must
transform this equation into a pair of first-order differential equations. In that case an additional
state variable will be introduced. Both the integrand $F$ and the equation of motion are
assumed to be continuous in all their arguments and to possess continuous first-order partial
derivatives with respect to the state variable $y$ and the time variable $t$, but not necessarily the
control variable $u$. In the third line, we indicate that the initial state, the value of $y$ at $t = 0$,
is a constant $A$, but the terminal state $y(T)$ is left unrestricted. Finally, the fourth line indicates
that the permissible choices of $u$ are limited to a control region $U$. It may happen, of
course, that $u(t)$ is not restricted.


Illustration: A Simple Macroeconomic Model

Consider an economy that produces output $Y$ using capital $K$ and a fixed amount of labor $L$,
according to the production function

$$Y = Y(K, L)$$

Further, output is used either for consumption $C$ or for investment $I$. If we ignore the problem
of depreciation, then

$$I = \frac{dK}{dt}$$

In other words, investment is the change in capital stock over time. Thus we can also write
investment as

$$I = Y - C = Y(K, L) - C = \frac{dK}{dt}$$

which gives us a first-order differential equation in the variable $K$.
If our objective is to maximize some form of social utility over a fixed planning period,
then the problem becomes

$$
\begin{aligned}
\text{Maximize} \quad & \int_0^T U(C)\,dt \\
\text{subject to} \quad & \frac{dK}{dt} = Y(K, L) - C \\
\text{and} \quad & K(0) = K_0 \qquad K(T) = K_T
\end{aligned}
\tag{20.2}
$$

where $K_0$ and $K_T$ are the initial value and terminal (target) value of $K$. Note that in (20.2),
the terminal state is a fixed value, not left free as in (20.1). Here $C$ serves as the control
variable and $K$ is the state variable. The problem is to choose the optimal control path $C(t)$
such that its impact on output $Y$ and capital $K$, and the repercussions therefrom upon $C$
itself, will together maximize the aggregate utility over the planning period.
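How a chosen control path $C(t)$ drives the state variable $K$ through the equation of motion can be sketched numerically. The code below is an illustration only: the text leaves $Y(K, L)$ in general form, so the square-root production function, the constant consumption paths, the initial stock, and the name `simulate_capital` are all assumptions introduced here.

```python
def simulate_capital(consumption, k0, horizon, steps=1000, labor=1.0):
    """Euler path of dK/dt = Y(K, L) - C(t), returned as a list of K values."""
    dt = horizon / steps
    k = k0
    path = [k]
    for i in range(steps):
        t = i * dt
        # Hypothetical production function Y = K**0.5 * L**0.5;
        # max(k, 0) guards against a negative stock under heavy consumption.
        output = (max(k, 0.0) ** 0.5) * (labor ** 0.5)
        k = k + (output - consumption(t)) * dt
        path.append(k)
    return path


# Two hypothetical constant controls over the same planning period.
frugal = simulate_capital(lambda t: 0.5, k0=1.0, horizon=0.5)
lavish = simulate_capital(lambda t: 1.5, k0=1.0, horizon=0.5)
```

Lower consumption leaves more output for investment, so the first path ends with a larger capital stock than the second. In problem (20.2) the control must in addition steer $K$ to the target $K_T$.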

Pontryagin's Maximum Principle 

The key to optimal control theory is a first-order necessary condition known as the maximum
principle.† The statement of the maximum principle involves an approach that is akin
to the Lagrangian function and the Lagrangian multiplier variable. For optimal control
problems, these are known as the Hamiltonian function and costate variable, concepts we
will now develop.

The Hamiltonian 

In (20.1), there are three variables: time $t$, the state variable $y$, and the control variable $u$. We
now introduce a new variable known as the costate variable and denoted by $\lambda(t)$. Like the
Lagrange multiplier, the costate variable measures the shadow price of the state variable.

The costate variable is introduced into the optimal control problem via a Hamiltonian
function (or Hamiltonian, for short). The Hamiltonian is defined as

$$H(t, y, u, \lambda) = F(t, y, u) + \lambda(t) f(t, y, u) \tag{20.3}$$

where $H$ denotes the Hamiltonian and is a function of four variables: $t$, $y$, $u$, and $\lambda$.

The Maximum Principle 

The maximum principle, the main tool for solving problems of optimal control, is so
named because, as a first-order necessary condition, it requires us to choose $u$ so as to maximize
the Hamiltonian $H$ at every point of time.

Since, aside from the control variable $u$, $H$ involves the state variable $y$ and costate
variable $\lambda$, the statement of the maximum principle also stipulates how $y$ and $\lambda$ should
change over time, via an equation of motion for the state variable $y$ (state equation for
short) as well as an equation of motion for the costate variable $\lambda$ (costate equation for
short). The state equation always comes as part of the problem statement itself, as in
the second equation in (20.1). But in view of the fact that (20.3) implies $\partial H/\partial \lambda = f(t, y, u)$, the
maximum principle describes the state equation $y' = f(t, y, u)$ as

$$y' = \frac{\partial H}{\partial \lambda} \tag{20.4}$$

† The term "maximum principle" is attributed to L. S. Pontryagin and his associates, and is often
referred to as Pontryagin's maximum principle. See The Mathematical Theory of Optimal Control
Processes by L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, Interscience,
New York, 1962 (translated by K. N. Trirogoff).


In contrast, $\lambda$ does not appear in the problem statement (20.1) and its equation of motion
enters into the picture purely as an optimization condition. The costate equation is

$$\lambda' \left( \equiv \frac{d\lambda}{dt} \right) = -\frac{\partial H}{\partial y} \tag{20.5}$$

Note that both equations of motion are stated in terms of the partial derivatives of $H$, suggesting
some symmetry, but there is a negative sign attached to $\partial H/\partial y$ in (20.5).

Equations (20.4) and (20.5) constitute a system of two differential equations. Thus we
need two boundary conditions to definitize the two arbitrary constants that will arise in
the process of solution. If both the initial state $y(0)$ and the terminal state $y(T)$ are fixed,
then these specifications can be used to definitize the constants. But if, as in problem (20.1),
the terminal state is not fixed, then something called a transversality condition must be
included as part of the maximum principle, to fill the gap left by the missing boundary
condition.

Summing up the preceding, we can state the various components of the maximum principle
for problem (20.1) as follows:


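$$
\begin{aligned}
\text{(i)} \quad & H(t, y, u^*, \lambda) \geq H(t, y, u, \lambda) \quad \text{for all } t \in [0, T] \\
\text{(ii)} \quad & y' = \frac{\partial H}{\partial \lambda} \quad \text{(state equation)} \\
\text{(iii)} \quad & \lambda' = -\frac{\partial H}{\partial y} \quad \text{(costate equation)} \\
\text{(iv)} \quad & \lambda(T) = 0 \quad \text{(transversality condition)}
\end{aligned}
\tag{20.6}
$$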


Condition i in (20.6) states that at every time $t$ the value of $u(t)$, the optimal control,
must be chosen so as to maximize the value of the Hamiltonian over all admissible values
of $u(t)$. In the case where the Hamiltonian is differentiable with respect to $u$ and yields an
interior solution, Condition i can be replaced by

$$\frac{\partial H}{\partial u} = 0$$

However, if the control region is a closed set, then boundary solutions are possible and
$\partial H/\partial u = 0$ may not apply. In fact, the maximum principle does not even require the Hamiltonian
to be differentiable with respect to $u$.

Conditions ii and iii of the maximum principle, $y' = \partial H/\partial \lambda$ and $\lambda' = -\partial H/\partial y$, give us
two equations of motion, referred to as the Hamiltonian system for the given problem.
Condition iv, $\lambda(T) = 0$, is the transversality condition appropriate for the free-terminal-state
problem only.
Example 1


To illustrate the use of the maximum principle, let us first consider a simple noneconomic
example: that of finding the shortest path from a given point $A$ to a given straight line. In
Fig. 20.1, we have plotted the point $A$ on the vertical axis in the $ty$ plane, and drawn the
straight line as a vertical one at $t = T$. Three (out of an infinite number of) admissible paths
are shown, each with a different length. The length of any path is the aggregate of small
path segments, each of which can be considered as the hypotenuse (not drawn) of a triangle
formed by small movements $dt$ and $dy$. Denoting the hypotenuse by $dh$, we have, by
Pythagoras's theorem,

$$dh^2 = dt^2 + dy^2$$

Dividing both sides by $dt^2$ and taking the square root yields

$$\frac{dh}{dt} = \left[ 1 + \left( \frac{dy}{dt} \right)^2 \right]^{1/2} = \left[ 1 + (y')^2 \right]^{1/2} \tag{20.7}$$

The total length of the path can then be found by integrating (20.7) with respect to $t$, from
$t = 0$ to $t = T$. If we let $y' = u$ be the control variable, (20.7) can be expressed as

$$\frac{dh}{dt} = (1 + u^2)^{1/2} \tag{20.7'}$$

To minimize the integral of (20.7') is, of course, equivalent to maximizing the negative of
(20.7'). Thus the shortest-path problem is:

$$
\begin{aligned}
\text{Maximize} \quad & \int_0^T -(1 + u^2)^{1/2}\,dt \\
\text{subject to} \quad & y' = u \\
\text{and} \quad & y(0) = A \qquad y(T) \text{ free}
\end{aligned}
$$

The Hamiltonian for the problem is, by (20.3),

$$H = -(1 + u^2)^{1/2} + \lambda u$$

FIGURE 20.1 [Figure not reproduced: admissible paths in the $ty$ plane from point $A$ on the vertical axis to the vertical terminal line at $t = T$]
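Before the maximum principle is applied, the objective itself can be explored numerically. The sketch below is only an illustration and not part of the derivation: it approximates the integral of (20.7') by a midpoint rule for three admissible controls. The function name `path_length`, the horizon `T = 2`, and the three trial controls are assumptions chosen for the example.

```python
import math


def path_length(u, horizon, steps=10000):
    """Midpoint-rule approximation of the integral of (1 + u(t)**2)**0.5 on [0, horizon]."""
    dt = horizon / steps
    return sum(math.sqrt(1 + u((i + 0.5) * dt) ** 2) * dt for i in range(steps))


T = 2.0  # hypothetical terminal time

flat = path_length(lambda t: 0.0, T)            # horizontal path, slope 0
sloped = path_length(lambda t: 0.5, T)          # straight path with slope 0.5
curved = path_length(lambda t: math.cos(t), T)  # a curved path
```

Among these three trial paths, the flat one has the smallest length, equal to $T$ itself, since the integrand $(1 + u^2)^{1/2}$ is never below 1.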
Since $H$ is differentiable in $u$, and $u$ is unrestricted, the following first-order condition can be
used to maximize $H$:

$$\frac{\partial H}{\partial u} = -\frac{1}{2}(1 + u^2)^{-1/2}(2u) + \lambda = 0$$

or

$$u(t) = \lambda (1 - \lambda^2)^{-1/2}$$

Checking the second-order condition, we find that

$$\frac{\partial^2 H}{\partial u^2} = -(1 + u^2)^{-3/2} < 0$$

which verifies that the solution to $u(t)$ does maximize the Hamiltonian. Since $u(t)$ is a function
of $\lambda$, we need a solution to the costate variable. From the first-order conditions, the
equation of motion for the costate variable is

$$\lambda' = -\frac{\partial H}{\partial y} = 0$$

since $H$ is independent of $y$. Thus, $\lambda$ is a constant. To definitize this constant, we can make
use of the transversality condition $\lambda(T) = 0$. Since $\lambda$ can take only a single value, now
known to be zero, we actually have $\lambda(t) = 0$ for all $t$. Thus we can write

$$\lambda^*(t) = 0 \quad \text{for all } t \in [0, T]$$

It follows that the optimal control is

$$u^*(t) = \lambda^* \left[ 1 - (\lambda^*)^2 \right]^{-1/2} = 0$$

Finally, using the equation of motion for the state variable, we see that

$$y' = u = 0 \qquad \text{or} \qquad y^*(t) = c_0 \quad \text{(a constant)}$$

Incorporating the initial condition

$$y(0) = A$$

we can conclude that $c_0 = A$, and write

$$y^*(t) = A \quad \text{for all } t$$

In Fig. 20.1, this path is the line $AB$. The shortest path is found to be a straight line with a
zero slope.


Example 2


Find the optimal control path that will

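$$
\begin{aligned}
\text{Maximize} \quad & \int_0^1 (y - u^2)\,dt \\
\text{subject to} \quad & y' = u \\
\text{and} \quad & y(0) = 5 \qquad y(1) \text{ free}
\end{aligned}
$$

This problem is in the format of (20.1), except that $u$ is unrestricted.
The Hamiltonian for this problem,

$$H = y - u^2 + \lambda u$$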
is concave in $u$, and $u$ is unrestricted, so we can maximize $H$ by applying the first-order condition
(also sufficient because of concavity of $H$):

$$\frac{\partial H}{\partial u} = -2u + \lambda = 0$$

which gives us

$$u(t) = \frac{\lambda}{2} \qquad \text{or} \qquad y' = \frac{\lambda}{2} \tag{20.8}$$

The equation of motion for $\lambda$ is

$$\lambda' = -\frac{\partial H}{\partial y} = -1 \tag{20.8'}$$

The last two equations constitute the differential-equation system for this problem.

We can first solve for $\lambda$ by straight integration of (20.8') to get

$$\lambda(t) = c_1 - t \qquad (c_1 \text{ arbitrary})$$

Moreover, by the transversality condition in (20.6), we must have $\lambda(1) = 0$. Setting $t = 1$ in
the last equation yields $c_1 = 1$. Thus the optimal costate path is

$$\lambda^*(t) = 1 - t$$

It follows that $y' = \frac{1}{2}(1 - t)$, by (20.8), and by integration,

$$y(t) = \frac{1}{2}t - \frac{1}{4}t^2 + c_2 \qquad (c_2 \text{ arbitrary})$$

The arbitrary constant can be definitized by using the initial condition $y(0) = 5$. Setting
$t = 0$ in the preceding equation, we get $5 = y(0) = c_2$. Thus the optimal path for the state
variable is

$$y^*(t) = \frac{1}{2}t - \frac{1}{4}t^2 + 5$$

and the corresponding optimal control path is

$$u^*(t) = \frac{1}{2}(1 - t)$$
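A numerical comparison offers a rough check on the solution of Example 2. The sketch below evaluates the objective $\int_0^1 (y - u^2)\,dt$, with $y' = u$ and $y(0) = 5$, for the control $u^*(t) = \frac{1}{2}(1 - t)$ and for a few other controls. The function `objective`, the midpoint scheme, and the alternative controls are assumptions chosen for illustration; beating a handful of alternatives is of course not a proof of optimality.

```python
def objective(u, steps=20000):
    """Approximate the integral of (y - u**2) over [0, 1] with y' = u, y(0) = 5."""
    dt = 1.0 / steps
    y, total = 5.0, 0.0
    for i in range(steps):
        ut = u((i + 0.5) * dt)           # control at the midpoint of the step
        y_mid = y + 0.5 * ut * dt        # state at the midpoint of the step
        total += (y_mid - ut ** 2) * dt
        y += ut * dt                     # advance the state by one step
    return total


best = objective(lambda t: 0.5 * (1.0 - t))   # control path from Example 2

# Hypothetical alternative controls, for comparison only.
alternatives = [
    objective(lambda t: 0.0),
    objective(lambda t: 0.5),
    objective(lambda t: 1.0 - t),
]
```

Integrating by hand with $y^*(t)$ and $u^*(t)$ gives an objective value of $5 + \frac{1}{12}$ for the control found in the example, while each of the three alternatives yields exactly 5.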


Example 3 


Find the optimal control path that will

$$
\begin{aligned}
\text{Maximize} \quad & \int_0^2 (2y - 3u)\,dt \\
\text{subject to} \quad & y' = y + u \\
& y(0) = 4 \qquad y(2) \text{ free} \\
\text{and} \quad & u(t) \in [0, 2]
\end{aligned}
$$

The fact that the control variable is restricted to the closed set [0, 2] gives rise to the possibility
of boundary solutions.

The Hamiltonian function

$$H = 2y - 3u + \lambda(y + u) = (2 + \lambda)y + (\lambda - 3)u$$
is linear in $u$. If we plot $H$ against $u$ in the $uH$ plane, we get a straight line with slope
$\partial H/\partial u = \lambda - 3$, which is positive if $\lambda > 3$ (Line 1), but negative if $\lambda < 3$ (Line 2), as illustrated
in Fig. 20.2. If at any time $\lambda$ exceeds 3, then the maximum $H$ occurs at the upper
boundary of the control region and we must choose $u = 2$. If, on the other hand, $\lambda$ falls
below 3, then in order to maximize $H$, we must choose $u = 0$. In short, $u^*(t)$ depends on
$\lambda(t)$ as follows:

$$u^*(t) = \begin{Bmatrix} 2 \\ 0 \end{Bmatrix} \quad \text{if} \quad \lambda(t) \begin{Bmatrix} > \\ < \end{Bmatrix} 3 \tag{20.9}$$

Thus, it is critical to find $\lambda(t)$. To do this, we start from the costate equation

$$\lambda' = -\frac{\partial H}{\partial y} = -2 - \lambda \qquad \text{or} \qquad \lambda' + \lambda = -2$$

The general solution of this equation is

$$\lambda(t) = A e^{-t} - 2 \qquad \text{[by (15.5)]}$$

where $A$ is an arbitrary constant. By using the transversality condition $\lambda(T) = \lambda(2) = 0$, we
find that $A = 2e^2$. Thus the definite solution for $\lambda$ is

$$\lambda^*(t) = 2e^{2-t} - 2 \tag{20.10}$$

which is a decreasing function of $t$, falling steadily from the initial value $\lambda^*(0) = 2e^2 - 2 =
12.778$ to a terminal value $\lambda^*(2) = 2e^0 - 2 = 0$. This means that $\lambda^*$ must pass through the
point $\lambda = 3$ at some critical time $\tau$, when the optimal $u$ has to be switched from $u^* = 2$ to
$u^* = 0$.

To find this critical time $\tau$, we set $\lambda^*(\tau) = 3$ in (20.10):

$$3 = \lambda^*(\tau) = 2e^{2-\tau} - 2 \qquad \text{or} \qquad e^{2-\tau} = \frac{5}{2} = 2.5$$

Taking the natural log of both sides, we get

$$\ln e^{2-\tau} = \ln 2.5 \qquad \text{or} \qquad 2 - \tau = \ln 2.5$$

Thus

$$\tau = 2 - \ln 2.5 = 1.084 \quad \text{(approx.)}$$

and the optimal control turns out to consist of two phases in the time interval [0, 2]:

Phase 1: $u^*[0, \tau) = 2$        Phase 2: $u^*[\tau, 2] = 0$
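The switching time found in Example 3 can be probed numerically by comparing it with other switching times. The sketch below simulates $y' = y + u$ from $y(0) = 4$ with a control that equals 2 before a chosen switch time and 0 afterward, and accumulates the objective $\int_0^2 (2y - 3u)\,dt$. The function `objective`, the Euler scheme, and the alternative switch times are assumptions for illustration, and the comparison covers only single-switch controls of this form.

```python
import math


def objective(switch_time, steps=20000, horizon=2.0):
    """Euler approximation of the integral of (2y - 3u) with u = 2 before the switch, 0 after."""
    dt = horizon / steps
    y, total = 4.0, 0.0
    for i in range(steps):
        t = i * dt
        u = 2.0 if t < switch_time else 0.0
        total += (2 * y - 3 * u) * dt
        y += (y + u) * dt   # equation of motion y' = y + u
    return total


tau = 2 - math.log(2.5)          # critical time from the example

at_tau = objective(tau)
earlier = objective(tau - 0.3)   # hypothetical earlier switch
later = objective(tau + 0.3)     # hypothetical later switch
never = objective(2.0)           # keep u = 2 throughout
always_zero = objective(0.0)     # keep u = 0 throughout
```

Within this family of controls, switching at $\tau$ gives a larger objective value than any of the alternatives tried.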


20.2 Alternative Terminal Conditions 


What happens to the maximum principle when the terminal condition is different from the
one in (20.1)? In (20.1), we face a vertical terminal line, with a fixed terminal time but
unrestricted terminal state, as illustrated in Fig. 20.1. The maximum principle for the
maximization problem requires that

$$
\begin{aligned}
\text{(i)} \quad & H(t, y, u^*, \lambda) \geq H(t, y, u, \lambda) \quad \text{for all } t \in [0, T] \\
\text{(ii)} \quad & y' = \frac{\partial H}{\partial \lambda} \\
\text{(iii)} \quad & \lambda' = -\frac{\partial H}{\partial y}
\end{aligned}
$$

with the transversality condition

$$\text{(iv)} \quad \lambda(T) = 0$$

With alternative terminal conditions, Conditions i, ii, and iii will remain the same, but Condition
iv (the transversality condition) must be duly modified.


Fixed Terminal Point 

If the terminal point is fixed so that the terminal condition is $y(T) = y_T$ with both $T$ and $y_T$
given, then the terminal condition itself should provide the information to definitize one
constant. In this case, no transversality condition is needed.


Horizontal Terminal Line 

Suppose that the terminal state is fixed at a given target level $y_T$ but the terminal time $T$ is
free, so that we have the flexibility to reach the target in a hurry or at a leisurely pace. We
then have a horizontal terminal line as illustrated in Fig. 20.3a, which allows us to choose
between $T_1$, $T_2$, $T_3$, or other terminal times to reach the target level of $y$. For this case, the
transversality condition is a restriction on the Hamiltonian (rather than the costate variable)
at $t = T$:

$$[H]_{t=T} = 0 \qquad (20.11)$$

Truncated Vertical Terminal Line 

If we have a fixed terminal time $T$, and the terminal state is free but subject to the proviso
that $y_T \ge y_{\min}$, where $y_{\min}$ denotes a given minimum permissible level of $y$, we face a
truncated vertical terminal line, as illustrated in Fig. 20.3b.

The transversality condition for this case can be stated like the complementary-slackness
condition found in the Kuhn-Tucker conditions:

$$\lambda(T) \ge 0 \qquad y_T \ge y_{\min} \qquad (y_T - y_{\min})\lambda(T) = 0 \qquad (20.12)$$

The practical approach for solving this type of problem is to first try $\lambda(T) = 0$ as the
transversality condition and test if the resulting $y_T^*$ satisfies the restriction $y_T^* \ge y_{\min}$. If so,
the problem is solved. If not, then treat the problem as a given terminal point problem with
$y_{\min}$ as the terminal state.

Truncated Horizontal Terminal Line 

When the terminal state is fixed at $y_T$ and the terminal time is free but subject to the
restriction $T \le T_{\max}$, where $T_{\max}$ denotes the latest permissible time (a deadline) to reach
the given $y_T$, we face a truncated horizontal terminal line as illustrated in Fig. 20.3c. The
transversality condition becomes

$$[H]_{t=T_{\max}} \ge 0 \qquad T^* \le T_{\max} \qquad (T^* - T_{\max})[H]_{t=T_{\max}} = 0 \qquad (20.13)$$

This again appears in the format of the complementary-slackness condition.

The practical approach to solving this type of problem is to try $[H]_{t=T_{\max}} = 0$ first. If the
resulting solution value is $T^* \le T_{\max}$, then the problem is solved. If not, then we must take
$T_{\max}$ as a fixed terminal time which, together with the given $y_T$, defines a fixed end point,
and solve the problem as a fixed-end-point problem.

Example 1   In the problem

Maximize    $\int_0^1 (y - u^2)\,dt$
subject to  $y' = u$
and         $y(0) = 2 \qquad y(1) = a$

the terminal point is fixed, even though $y(1)$ is assigned a parametric rather than numerical
value here.

The Hamiltonian function

$$H = y - u^2 + \lambda u$$

is concave in $u$, so we can set $\partial H/\partial u = 0$ to maximize $H$:

$$\frac{\partial H}{\partial u} = -2u + \lambda = 0$$

Thus

$$u = \frac{\lambda}{2}$$

which shows that in order to solve for $u(t)$, we need to solve for $\lambda(t)$ first.

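The two equations of motion are

$$y'\ (= u) = \frac{\lambda}{2} \qquad \text{and} \qquad \lambda' = -\frac{\partial H}{\partial y} = -1$$

Direct integration of the last equation yields

$$\lambda(t) = c_1 - t \qquad (c_1 \text{ arbitrary})$$

which implies that

$$y' = \frac{1}{2}c_1 - \frac{1}{2}t$$

Again, by direct integration, we find that

$$y(t) = \frac{c_1}{2}t - \frac{1}{4}t^2 + c_2 \qquad (c_2 \text{ arbitrary})$$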

To definitize the two arbitrary constants, we make use of the initial condition $y(0) = 2$ and
the terminal condition $y(1) = a$. Setting $t = 0$ and $t = 1$, successively, in the preceding
equation, we obtain

$$2 = y(0) = c_2 \qquad a = y(1) = \frac{c_1}{2} - \frac{1}{4} + c_2 = \frac{c_1}{2} + \frac{7}{4}$$

Thus, $c_2 = 2$, and $c_1 = 2a - \frac{7}{2}$.

Therefore, we can write the optimal paths of this problem as:

$$y^*(t) = \left(a - \frac{7}{4}\right)t - \frac{1}{4}t^2 + 2$$

$$\lambda^*(t) = 2a - \frac{7}{2} - t$$

$$u^*(t) = \frac{1}{2}\lambda^*(t) = a - \frac{7}{4} - \frac{1}{2}t$$


Example 2   The problem

Maximize    $\int_0^T -(t^2 + u^2)\,dt$
subject to  $y' = u$
and         $y(0) = 4 \qquad y(T) = 5 \qquad T$ free

exemplifies the case of horizontal terminal line where the terminal state is fixed but the time
of arrival at the target level of $y$ is unrestricted. In fact, it is one of our tasks to solve for the
optimal value of $T$.

Since the Hamiltonian

$$H = -t^2 - u^2 + \lambda u$$

is concave in $u$, we can again maximize $H$ by using the first-order condition

$$\frac{\partial H}{\partial u} = -2u + \lambda = 0$$

which gives us

$$u = \frac{\lambda}{2} \qquad (20.14)$$

The concavity of $H$ makes it unnecessary to check the second-order condition, but if we
wish, it is easy to check that $\partial^2 H/\partial u^2 = -2 < 0$, sufficient for a maximum of $H$.

The equation of motion for $\lambda$ is

$$\lambda' = -\frac{\partial H}{\partial y} = 0$$

which implies that $\lambda$ is a constant. But we cannot yet determine its exact value at this point.
Turning to the equation of motion for $y$,

$$y' = u = \frac{\lambda}{2} \qquad [\text{by } (20.14)]$$

we can obtain, by direct integration,

$$y(t) = \frac{\lambda}{2}t + c \qquad (20.15)$$

Since $y(0) = 4$, we see that $c = 4$. Furthermore, the transversality condition (20.11)
requires that

$$[H]_{t=T} = -T^2 - \frac{\lambda^2}{4} + \frac{\lambda^2}{2} = -T^2 + \frac{\lambda^2}{4} = 0 \qquad [\text{by } (20.14)]$$

Solving the preceding equation for $T$, and taking the positive square root, we get

$$T = \frac{\lambda}{2} \qquad (20.16)$$

Since $\lambda$ is constant, so is $T$. We try now to find its exact value.

Applying the terminal-state specification $y(T) = 5$ to (20.15), and recalling that $c = 4$,
we get

$$y(T) = \frac{\lambda}{2}T + 4 = 5$$

In view of (20.16), the last equation can be rewritten as $T^2 = 1$. Thus, by taking the square
root, we can determine the optimal arrival time to be

$$T^* = 1 \qquad (\text{negative root unacceptable})$$

From this, we can readily deduce that

$$\lambda^*(t) = 2T^* = 2 \qquad [\text{by } (20.16)]$$
$$u^*(t) = \frac{\lambda^*}{2} = 1 \qquad [\text{by } (20.14)]$$
$$y^*(t) = t + 4 \qquad [\text{by } (20.15)]$$

The last result shows that, in this example, the optimal $y$ path is a straight line going from
the given initial point to the horizontal terminal line.
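
The optimal arrival time in Example 2 can be cross-checked by brute force. The sketch below rests on one observation from the derivation above: for any fixed $T$ the costate is constant, so by (20.14) the control is constant, and moving from $y(0) = 4$ to $y(T) = 5$ then requires $u = 1/T$. The grid of candidate $T$ values is an arbitrary illustrative choice.

```python
def objective(T):
    """Integral of -(t^2 + u^2) over [0, T] with u held at the constant rate 1/T.

    Assumption: u is constant for a given T (constant costate), so that
    reaching y(T) = 5 from y(0) = 4 requires u = 1/T.
    """
    u = 1.0 / T
    return -(T**3 / 3.0 + u**2 * T)

# Crude grid search over candidate arrival times T in [0.2, 3.0], step 0.001
candidates = [0.2 + 0.001 * i for i in range(2801)]
T_star = max(candidates, key=objective)
lam_star = 2.0 * T_star          # (20.16): T = lambda / 2
u_star = lam_star / 2.0          # (20.14): u = lambda / 2

print(round(T_star, 3), round(lam_star, 3), round(u_star, 3))
```

The grid search settles on the same arrival time, costate, and control as the transversality condition (20.11).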


EXERCISE 20.2 

Find the optimal paths of the control, state, and costate variables that will 

1. Maximize 

$\int_0^1 (y - u^2)\,dt$

subject to 

and 

$y(0) = 2 \qquad y(1)$ free

2. Maximize    $\int_0^8 6y\,dt$

   subject to  $y' = y + u$

   and         $y(0) = 10 \qquad y(8)$ free

               $u(t) \in [0, 2]$

3. Maximize    $\int_0^T -(au + bu^2)\,dt$

   subject to  $y' = y - u$

   and         $y(0) = y_0 \qquad y(T)$ free

4. Maximize 


subject to 

y' = u 

and 

$y(0) = y_0 \qquad y(T)$ free

5. Maximize    $\int_0^{20} -\frac{1}{2}u^2\,dt$

   subject to  $y' = u$

   and         $y(0) = 10 \qquad y(20) = 0$

6. Maximize 

r- 


subject to 

y' = y + u 



and         $y(0) = 5 \qquad y(4) \ge 300$

            $0 \le u(t) \le 2$


7. Maximize 

r-- 


subject to 

y' = y + u 


and         $y(0) = 1 \qquad y(1) = 0$

8. Maximize    $\int_1^2 (y + ut - u^2)\,dt$

   subject to  $y' = u$

   and         $y(1) = 3 \qquad y(2) = 4$

9. Maximize 


-a^dt 

subject to 

y' = u+y 


and 

X 

II 

$y(2)$ free


20.3 Autonomous Problems

In the general control problem framework, the variable $t$ can enter the objective function
and state equation directly. The general specification

Maximize    $\int_0^T F(t, y, u)\,dt$
subject to  $y' = f(t, y, u)$
and         boundary conditions

where $t$ explicitly enters into $F$ and $f$ means the date matters. That is, the value generated by
the activity $u(t)$ depends not only on the level, but also on exactly when this activity takes
place.

Problems in which $t$ is absent from the objective function and state equation, such as

Maximize    $\int_0^T F(y, u)\,dt$
subject to  $y' = f(y, u)$
and         boundary conditions

are called autonomous problems. In such problems, since the Hamiltonian

$$H = F(y, u) + \lambda f(y, u)$$

does not contain $t$ as an argument, the equations of motion are easier to solve; moreover,
they are amenable to the use of phase-diagram analysis.

In still other cases, in an otherwise autonomous problem, time $t$ enters into the picture
as part of the discount factor $e^{-rt}$, but nowhere else, so that the objective function takes the
form of

$$\int_0^T G(y, u)e^{-rt}\,dt$$

Strictly speaking, this problem is nonautonomous. However, it is easy to convert the problem
into an autonomous one by employing the so-called current-value Hamiltonian,
defined as:

$$H_c \equiv He^{rt} = G(y, u) + \mu f(y, u) \qquad (20.17)$$

where

$$\mu = \lambda e^{rt} \qquad (20.18)$$

is the current-value Lagrange multiplier. By focusing on the current (undiscounted) value,
we are able to eliminate $t$ from the original Hamiltonian.

Using $H_c$ in lieu of $H$, we must revise the maximum principle to:

(i)   $H_c(y, u^*, \mu) \ge H_c(y, u, \mu)$    for all $t \in [0, T]$

(ii)  $y' = \dfrac{\partial H_c}{\partial \mu}$

(iii) $\mu' = -\dfrac{\partial H_c}{\partial y} + r\mu$                                   (20.19)

(iv)  $\mu(T) = 0$           (for vertical terminal line)
  or  $[H_c]_{t=T} = 0$      (for horizontal terminal line)
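
The extra term $r\mu$ in condition (iii) can be traced back to the original costate equation. The short derivation below is a sketch that uses only (20.17), (20.18), and $\lambda' = -\partial H/\partial y$:

```latex
\begin{aligned}
\mu' &= \frac{d}{dt}\bigl(\lambda e^{rt}\bigr) = \lambda' e^{rt} + r\lambda e^{rt} \\
     &= -\frac{\partial H}{\partial y}\,e^{rt} + r\mu \\
     &= -\frac{\partial H_c}{\partial y} + r\mu
     \qquad \text{since } H_c \equiv He^{rt} \text{ and } e^{rt} \text{ does not depend on } y
\end{aligned}
```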


20.4 Economic Applications 


Lifetime Utility Maximization 

Suppose a consumer has the utility function $U(C(t))$, where $C(t)$ is consumption at time $t$.
The consumer's utility function is concave, and has the following properties:

$$U' > 0 \qquad U'' < 0$$

The consumer is also endowed with an initial stock of wealth, or capital, $K_0$, with income
stream derived from the stock of capital according to the following:

$$Y = rK$$

where $r$ is the market rate of interest. The consumer uses the income to purchase $C$. In
addition, the consumer can consume the capital stock. Any income not consumed is added to
the capital stock as investment. Thus,

$$K' \equiv I = Y - C = rK - C$$

The consumer's lifetime utility maximization problem is to

Maximize    $\int_0^T U(C(t))e^{-\delta t}\,dt$
subject to  $K' = rK(t) - C(t)$
and         $K(0) = K_0 \qquad K(T) \ge 0$

where $\delta$ is the consumer's personal rate of time preference ($\delta > 0$). It is assumed that
$C(t) > 0$ and $K(t) > 0$ for all $t$.

The Hamiltonian is

$$H = U(C(t))e^{-\delta t} + \lambda(t)[rK(t) - C(t)]$$

where $C$ is the control variable, and $K$ is the state variable. Since $U(C)$ is concave, and the
constraint is linear in $C$, we know that the Hamiltonian is concave and the maximization of
$H$ can be achieved by simply setting $\partial H/\partial C = 0$. Thus we have

$$\frac{\partial H}{\partial C} = U'(C)e^{-\delta t} - \lambda = 0 \qquad (20.20)$$

$$K' = rK(t) - C(t) \qquad (20.20')$$

$$\lambda' = -\frac{\partial H}{\partial K} = -r\lambda \qquad (20.20'')$$


Equation (20.20) states that the discounted marginal utility should be equated to the
present shadow price of an additional unit of capital. Differentiating (20.20) with respect to $t$,
we get

$$U''(C)C'e^{-\delta t} - \delta U'(C)e^{-\delta t} = \lambda' \qquad (20.21)$$

In view of (20.20) and (20.20'') we have

$$\lambda' = -r\lambda = -rU'(C)e^{-\delta t}$$

which can be substituted into (20.21) to yield

$$U''(C)C'(t)e^{-\delta t} - \delta U'(C)e^{-\delta t} = -rU'(C)e^{-\delta t}$$

or, after canceling the common factor $e^{-\delta t}$ and rearranging,

$$-\frac{U''(C(t))}{U'(C(t))}\,C'(t) = r - \delta$$

Since $U' > 0$ and $U'' < 0$, the sign of the derivative $C'(t)$ has to be the same as $(r - \delta)$.
Therefore, if $r > \delta$, the optimal consumption will rise over time; if $r < \delta$, the optimal
consumption will decline over time.

Solving (20.20'') directly gives us

$$\lambda(t) = \lambda_0 e^{-rt}$$

where $\lambda_0 > 0$ is the constant of integration. Combining this with (20.20) gives us

$$U'(C(t)) = \lambda e^{\delta t} = \lambda_0 e^{(\delta - r)t}$$

which shows that the marginal utility of consumption will optimally decrease over time if
$r > \delta$, but increase over time if $r < \delta$.

Since the terminal condition $K(T) \ge 0$ identifies the present problem as one with a
truncated vertical terminal line, the appropriate transversality condition is, by (20.12),

$$\lambda(T) \ge 0 \qquad K(T) \ge 0 \qquad K(T)\lambda(T) = 0$$

The key condition is the complementary-slackness stipulation, which means that either the
capital stock $K$ must be exhausted on the terminal date, or the shadow price of capital $\lambda$
must fall to zero on the terminal date. Since $U'(C) > 0$ by assumption, the marginal utility can
never be zero. Therefore, the marginal value of capital cannot be zero. This implies that the
capital stock should optimally be exhausted by the terminal date $T$ in this model.
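
To see these results with numbers, the sketch below assumes logarithmic utility, $U(C) = \ln C$, which the text does not specify. In that case $-U''/U' = 1/C$, so $C'/C = r - \delta$ and $C(t) = C_0 e^{(r-\delta)t}$, and the exhaustion condition $K(T) = 0$ pins down $C_0$. The parameter values and the function name `lifetime_plan` are illustrative assumptions.

```python
import math

def lifetime_plan(K0, r, delta, T, steps=20000):
    """Initial consumption and terminal capital, assuming U(C) = ln C.

    With log utility, C(t) = C0 * exp((r - delta) t). C0 is chosen so that
    the present value of consumption equals K0, which makes K(T) = 0.
    """
    C0 = delta * K0 / (1.0 - math.exp(-delta * T))
    dt = T / steps
    K, t = K0, 0.0
    for _ in range(steps):
        # Midpoint (second-order Runge-Kutta) step for K' = r K - C(t)
        C_start = C0 * math.exp((r - delta) * t)
        K_mid = K + 0.5 * dt * (r * K - C_start)
        C_mid = C0 * math.exp((r - delta) * (t + 0.5 * dt))
        K += dt * (r * K_mid - C_mid)
        t += dt
    return C0, K

C0, K_T = lifetime_plan(K0=100.0, r=0.05, delta=0.03, T=40.0)
print(C0, K_T)   # K_T should be numerically close to zero
```

Because $r > \delta$ in this illustration, consumption rises over time, and the integrated capital path runs down to approximately zero at $T$, as the transversality condition requires.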

Exhaustible Resource 

Let $s(t)$ denote a stock of an exhaustible resource and $q(t)$ be the rate of extraction at any
time $t$ such that

$$s' = -q$$

The extracted resource produces a final consumer good $c$ such that

$$c = c(q) \qquad \text{where } c' > 0,\ c'' < 0 \qquad (20.22)$$

The consumption good is the sole argument in the utility function of a representative
consumer with the following properties:

$$U = U(c) \qquad \text{where } U' > 0,\ U'' < 0 \qquad (20.22')$$

The consumer wishes to maximize the utility function over a given interval $[0, T]$. Since
$c$ is a function of $q$, the rate of extraction, $q$ will serve as the control variable. For
simplicity, we ignore the issue of discounting over time. The dynamic problem is then to choose
the optimal extraction rate that maximizes the utility function subject only to a nonnegativity
constraint on the state variable $s(t)$, the stock of the exhaustible resource. The
formulation is

Maximize    $\int_0^T U(c(q))\,dt$
subject to  $s' = -q$                                   (20.23)
and         $s(0) = s_0 \qquad s(T) \ge 0$

where $s_0$ and $T$ are given.

The Hamiltonian for the problem is

$$H = U(c(q)) - \lambda q$$

Since $H$ is concave in $q$ by model specifications on the $U(c(q))$ function, we can maximize
$H$ by setting $\partial H/\partial q = 0$:

$$\frac{\partial H}{\partial q} = U'(c(q))c'(q) - \lambda = 0 \qquad (20.24)$$

The concavity of $H$ assures us that (20.24) maximizes $H$, but we can easily check the
second-order condition and confirm that $\partial^2 H/\partial q^2$ is negative.

The maximum principle stipulates that

$$\lambda' = -\frac{\partial H}{\partial s} = 0$$

which implies that

$$\lambda(t) = c_0 \quad \text{a constant} \qquad (20.25)$$

To determine $c_0$, we turn to the transversality conditions. Since the model specifies
$s(T) \ge 0$, it has a truncated vertical terminal line, so (20.12) applies:

$$\lambda(T) \ge 0 \qquad s(T) \ge 0 \qquad s(T)\lambda(T) = 0$$

In practical applications, the initial step is to try $\lambda(T) = 0$, solve for $q$, and see if the
solution will work. Since $\lambda(t)$ is a constant, to try $\lambda(T) = 0$ implies $\lambda(t) = 0$ for all $t$, and
$\partial H/\partial q$ in (20.24) reduces to

$$U'(c)c'(q) = 0$$

which (in principle) can be solved for $q$. Since $t$ is not an explicit argument of $U$ or $c$, the
solution path for $q$ is constant over time:

$$q^*(t) = q^*$$

Now, we check if $q^*$ satisfies the restriction $s(T) \ge 0$. If $q^*$ is a constant, then the
equation of motion

$$s' = -q$$

can be readily integrated, yielding

$$s(t) = -qt + c_1 \qquad [c_1 = \text{constant of integration}]$$

Using the initial condition $s(0) = s_0$ yields a solution for the constant of integration

$$c_1 = s_0$$

and the optimal state path is

$$s(t) = s_0 - q^*t \qquad (20.26)$$

Without specifying the functional forms for $U$ and $c$, no numerical solution can be found
for $q^*$. However, from the transversality conditions, we can conclude that if $s(T) \ge 0$, then
$q^*$ as derived in the solution is acceptable. But if $s(T) < 0$ for the given $q^*$, then the
extraction rate is too high and we need to find a different solution. Since the trial solution
$\lambda(T) = 0$ failed, we now take the alternative of $\lambda(T) > 0$. Even in this case, though, $\lambda$ is
still a constant by (20.25). And (20.24) can still (in principle) yield a constant, but
different, solution value $q_2^*$. It follows that (20.26) remains valid. But this time, with $\lambda(T) > 0$,
the transversality condition (20.12) dictates that $s(T) = 0$, or in view of (20.26),

$$s_0 - q_2^*T = 0$$

Thus we can write the revised (constant) optimal rate of extraction as

$$q_2^* = \frac{s_0}{T}$$

This new solution value should represent a lower extraction rate that would not violate the
$s(T) \ge 0$ boundary condition.
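
The two-step test described above can be written out as a small routine. This is only a sketch: `extraction_rate` and its argument `q_trial` are hypothetical names, and `q_trial` stands for whatever constant rate the trial $\lambda(T) = 0$ would imply for particular $U$ and $c$ functions, which the text leaves unspecified.

```python
def extraction_rate(q_trial, s0, T):
    """Two-step test for the truncated vertical terminal line (20.12).

    q_trial is the constant rate implied by the trial lambda(T) = 0,
    i.e. a hypothetical root of U'(c)c'(q) = 0; pass None if no finite
    root exists. Returns the constant optimal rate and the terminal stock.
    """
    if q_trial is not None and s0 - q_trial * T >= 0:
        # Trial solution respects s(T) >= 0, so lambda(T) = 0 stands.
        q_star = q_trial
    else:
        # Otherwise lambda(T) > 0 forces s(T) = 0, i.e. q = s0 / T.
        q_star = s0 / T
    return q_star, s0 - q_star * T   # terminal stock from (20.26)

print(extraction_rate(q_trial=3.0, s0=100.0, T=20.0))   # trial rate is feasible
print(extraction_rate(q_trial=8.0, s0=100.0, T=20.0))   # trial rate would overdraw the stock
```

In the second call the trial rate would drive the stock negative, so the routine falls back on the rate that exhausts the stock exactly at $T$.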


EXERCISE 20.4

1. Maximize    $\int_0^T (K - aK^2 - I^2)\,dt \qquad (a > 0)$

   subject to  $K' = I - \delta K \qquad (\delta > 0)$

   and         $K(0) = K_0 \qquad K(T)$ free

2. Solve the following exhaustible resource problem for the optimal extraction path:

   Maximize    $\int_0^T \ln(q)e^{-\delta t}\,dt$

   subject to  $s' = -q$

   and         $s(0) = s_0 \qquad s(T) \ge 0$


20.5 Infinite Time Horizon 


In this section we introduce the problem of dynamic optimization over an infinite planning 
period. Infinite time horizon models tend to introduce complexities with respect to trans- 
versality conditions and optimal time paths that differ from those developed earlier. Rather 
than address these issues here, we shall illustrate the methodology of such models with a 
version of the neoclassical optimal growth model. 

Neoclassical Optimal Growth Model 

The standard neoclassical production function expresses output $Y$ as a function of two
inputs: labor $L$ and capital $K$. Its general form is

$$Y = Y(K, L)$$

where $Y(K, L)$ is a linearly homogeneous function with the properties

$$Y_L > 0 \qquad Y_K > 0 \qquad Y_{LL} < 0 \qquad Y_{KK} < 0$$

Rewriting the production function in per capita terms yields

$$y = \phi(k) \qquad \text{with } \phi'(k) > 0 \text{ and } \phi''(k) < 0$$

where $y = Y/L$ and $k = K/L$. Total output $Y$ is allocated to consumption $C$ or gross
investment $I$. Let $\delta$ be the rate of depreciation of the capital stock $K$. Then net investment or
changes to the capital stock can be written as

$$K' = I - \delta K = Y - C - \delta K$$

Denoting per capita consumption as $c = C/L$, we can write this as

$$\frac{1}{L}K' = y - c - \delta k \qquad (20.27)$$

The right-hand side of (20.27) is in per capita terms, but the left-hand side is not. To unify,
we note that

$$K' = \frac{dK}{dt} = \frac{d}{dt}(kL) = k\frac{dL}{dt} + L\frac{dk}{dt} \qquad (20.28)$$

If the population growth rate is*

$$\frac{dL/dt}{L} = n \qquad \text{so that} \qquad \frac{dL}{dt} = nL$$

then (20.28) becomes

$$K' = knL + Lk' \qquad \text{or} \qquad \frac{1}{L}K' = kn + k'$$

Substituting this into (20.27) transforms the latter into an equation entirely in per capita
terms:

$$k' = y - c - (n + \delta)k = \phi(k) - c - (n + \delta)k \qquad (20.27')$$


Let $U(c)$ be the social welfare function (expressed in per capita terms), where

$$U'(c) > 0 \qquad \text{and} \qquad U''(c) < 0$$

and, to eliminate corner solutions, we also assume

$$U'(c) \to \infty \text{ as } c \to 0 \qquad \text{and} \qquad U'(c) \to 0 \text{ as } c \to \infty$$

If $\rho$ denotes the social discount rate and the initial population is normalized to one, the
objective function can be expressed as

$$V = \int_0^\infty U(c)e^{-\rho t}L_0e^{nt}\,dt = \int_0^\infty U(c)e^{-(\rho - n)t}\,dt = \int_0^\infty U(c)e^{-rt}\,dt \qquad \text{where } r \equiv \rho - n$$

In this version of the neoclassical optimal growth model, utility is weighted by a population
that grows continuously at a rate of $n$. However, if $r = \rho - n > 0$, then the model is
mathematically no different from one without population weights but with a positive discount
rate $r$.

The optimal growth problem can now be stated as

Maximize    $\int_0^\infty U(c)e^{-rt}\,dt$
subject to  $k' = \phi(k) - c - (n + \delta)k$                    (20.29)
            $k(0) = k_0$
and         $0 \le c(t) \le \phi(k)$

where $k$ is the state variable and $c$ is the control variable.

* In this model we assume labor force and population to be one and the same.

The Hamiltonian for the problem is

$$H = U(c)e^{-rt} + \lambda[\phi(k) - c - (n + \delta)k]$$

Since $H$ is concave in $c$, the maximum of $H$ corresponds to an interior solution in the
control region $[0 \le c \le \phi(k)]$, and therefore we can find the maximum of $H$ from

$$\frac{\partial H}{\partial c} = U'(c)e^{-rt} - \lambda = 0$$

or

$$U'(c) = \lambda e^{rt} \qquad (20.30)$$

The economic interpretation of (20.30) is that, along the optimal path, the marginal utility
of per capita consumption should equal the shadow price of capital ($\lambda$) weighted by $e^{rt}$.
Checking second-order conditions, we find

$$\frac{\partial^2 H}{\partial c^2} = U''(c)e^{-rt} < 0$$

Therefore, the Hamiltonian is maximized.

From the maximum principle, we have two equations of motion

$$k' = \frac{\partial H}{\partial \lambda} = \phi(k) - c - (n + \delta)k$$

and

$$\lambda' = -\frac{\partial H}{\partial k} = -\lambda[\phi'(k) - (n + \delta)]$$

The two equations of motion combined with $U'(c) = \lambda e^{rt}$ should in principle define
a solution for $c$, $k$, and $\lambda$. However, at this level of generality we are unable to do more than
undertake qualitative analysis of the model. Anything more would require specific forms of
both the utility and production functions.


The Current-Value Hamiltonian 

Since the preceding model is an example of an autonomous problem ($t$ is not a separate
argument in the utility function or state equation but appears only in the discount factor),
we may use the current-value Hamiltonian written as

$$H_c = He^{rt} = U(c) + \mu[\phi(k) - c - (n + \delta)k] \qquad [\text{see } (20.17)]$$

where $\mu = \lambda e^{rt}$.

The maximum principle calls for

$$\frac{\partial H_c}{\partial c} = U'(c) - \mu = 0 \qquad \text{or} \qquad \mu = U'(c) \qquad (20.31)$$

$$k' = \frac{\partial H_c}{\partial \mu} = \phi(k) - c - (n + \delta)k \qquad (20.31')$$

$$\mu' = -\frac{\partial H_c}{\partial k} + r\mu = -\mu[\phi'(k) - (n + \delta)] + r\mu = -\mu[\phi'(k) - (n + \delta + r)] \qquad (20.31'')$$

Equations (20.31') and (20.31'') constitute an autonomous differential equation system.
This makes possible a qualitative analysis by phase diagram.

[FIGURE 20.4]


Constructing a Phase Diagram

The variables in the differential equations (20.31') and (20.31'') are $k$ and $\mu$. Since (20.31)
involves a function of $c$, namely $U'(c)$, rather than the plain $c$ itself, it would be simpler to
construct a phase diagram in the $kc$ space rather than the $k\mu$ space. To do this, we shall try
to eliminate $\mu$. Since $\mu = U'(c)$, by (20.31), differentiation with respect to $t$ gives us

$$\mu' = U''(c)c'$$

Substituting these expressions for $\mu$ and $\mu'$ into (20.31'') yields

$$c' = -\frac{U'(c)}{U''(c)}[\phi'(k) - (n + \delta + r)]$$

which is a differential equation in $c$. We now have the autonomous differential equation
system

$$k' = \phi(k) - c - (n + \delta)k \qquad (20.31')$$

and

$$c' = -\frac{U'(c)}{U''(c)}[\phi'(k) - (n + \delta + r)] \qquad (20.32)$$

To construct the phase diagram in the $kc$ space, we first draw the $k' = 0$ and $c' = 0$
curves which are defined by

$$c = \phi(k) - (n + \delta)k \qquad (k' = 0) \qquad (20.33)$$

and

$$\phi'(k) = n + \delta + r \qquad (c' = 0) \qquad (20.34)$$

These two curves are illustrated in Fig. 20.4. The equation for the $k' = 0$ curve, (20.33), has
the same structure as the fundamental equation of the Solow growth model, (15.30). Thus
the $k' = 0$ curve has the same general shape as the one in Fig. 15.5b. The $c' = 0$ curve, on
the other hand, plots as a vertical line because, given the model specifications $\phi'(k) > 0$ and
$\phi''(k) < 0$, $\phi(k)$ is associated with an upward-sloping concave curve, with a different
slope at every point on the curve, so that only a unique value of $k$ can satisfy (20.34). The
intersection of the two curves at point $E$ determines the intertemporal equilibrium values of
$k$ and $c$, because at point $E$, neither $k$ nor $c$ will change in value over time, resulting in a
steady state. We could label these values as $\bar{k}$ and $\bar{c}$ for intertemporal equilibrium values,
but we shall label them as $k^*$ and $c^*$ instead, because they also represent the equilibrium
values for optimal growth.

[FIGURE 20.5]
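
For a concrete production function the steady state can be computed directly from (20.33) and (20.34). The sketch below assumes the Cobb-Douglas per capita form $\phi(k) = k^{\alpha}$ with $0 < \alpha < 1$; both the functional form and the parameter values are illustrative assumptions, since the text keeps $\phi(k)$ general.

```python
def steady_state(alpha, n, delta, r):
    """Steady state (k*, c*) from (20.33) and (20.34), assuming phi(k) = k**alpha.

    The Cobb-Douglas form and the parameter values used below are
    illustrative assumptions; the text leaves phi(k) general.
    """
    # (20.34): phi'(k) = alpha * k**(alpha - 1) = n + delta + r
    k_star = (alpha / (n + delta + r)) ** (1.0 / (1.0 - alpha))
    # (20.33): c = phi(k) - (n + delta) k
    c_star = k_star ** alpha - (n + delta) * k_star
    return k_star, c_star

k_star, c_star = steady_state(alpha=0.3, n=0.01, delta=0.05, r=0.03)
print(k_star, c_star)
```

Because $\phi'(k)$ is strictly decreasing under this assumption, (20.34) has exactly one root, which is why the $c' = 0$ locus is a single vertical line.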

Analyzing the Phase Diagram 

The intersection point $E$ in Fig. 20.4 gives us a unique steady state. But what happens if we
are initially at some point other than $E$? Returning to our system of first-order differential
equations (20.31') and (20.32), we can deduce that

$$\frac{\partial k'}{\partial c} = -1 < 0 \qquad \text{and} \qquad \frac{\partial c'}{\partial k} = -\frac{U'(c)}{U''(c)}\phi''(k) < 0$$

Since $\partial k'/\partial c < 0$, all the points below the $k' = 0$ curve are characterized by $k' > 0$ and all
the points above the curve by $k' < 0$. Similarly, since $\partial c'/\partial k < 0$, all the points to the left
of the $c' = 0$ line are characterized by $c' > 0$ and all the points to the right of the line by
$c' < 0$. Thus the $k' = 0$ curve and the $c' = 0$ line divide the phase space into four regions,
each with its own distinct pairing of signs of $c'$ and $k'$. These are reflected in Fig. 20.5 by
the right-angled directional arrows in each region.

The streamlines that follow the directional arrows in each region tell us that the steady
state at point $E$ is a saddle point. If we have an initial point that lies on one of the two
stable branches of the saddle point, the dynamics of the system will lead us to point $E$. But any
initial point that does not lie on a stable branch will make us either skirt around point $E$,
never reaching it, or move steadily away from it. If we follow the streamlines of the latter
instances, we will eventually (as $t \to \infty$) end up either with $k = 0$ (exhaustion of capital)
or $c = 0$ (per capita consumption dwindling to zero), both of which are economically
unacceptable. Thus, the only viable alternative is to choose a $(k, c)$ pair so as to locate our
economy on a stable branch (a "yellow brick road," so to speak) that will take us to the
steady state at $E$. We have not explicitly talked about the transversality condition, but if we
had, it would have guided us to the steady state at $E$, where the per capita consumption can
be maintained at a constant level ever after.
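
The saddle-point property can also be confirmed by linearizing the system (20.31') and (20.32) around $E$. The sketch below makes two assumptions that are not in the text: $\phi(k) = k^{\alpha}$, and a constant-elasticity utility function for which $-U'(c)/U''(c) = c/\theta$. A saddle point corresponds to one negative and one positive eigenvalue of the Jacobian.

```python
import math

def saddle_check(alpha, n, delta, r, theta):
    """Eigenvalues of the linearized (k', c') system at the steady state.

    Assumptions (not from the text): phi(k) = k**alpha and a utility
    function with -U'(c)/U''(c) = c / theta.
    """
    k = (alpha / (n + delta + r)) ** (1.0 / (1.0 - alpha))   # (20.34)
    c = k ** alpha - (n + delta) * k                          # (20.33)
    phi2 = alpha * (alpha - 1.0) * k ** (alpha - 2.0)         # phi''(k) < 0
    # Jacobian of (20.31') and (20.32) evaluated at E:
    #   dk'/dk = phi'(k) - (n + delta) = r,   dk'/dc = -1
    #   dc'/dk = (c / theta) * phi''(k),      dc'/dc = 0
    trace = r
    det = (c / theta) * phi2      # = r*0 - (-1)*(c/theta)*phi''(k) < 0
    disc = math.sqrt(trace ** 2 - 4.0 * det)
    return (trace - disc) / 2.0, (trace + disc) / 2.0

low, high = saddle_check(alpha=0.3, n=0.01, delta=0.05, r=0.03, theta=2.0)
print(low, high)   # opposite signs indicate a saddle point
```

The determinant is negative whenever $\phi''(k) < 0$, so the eigenvalues always have opposite signs under these assumptions, in line with the pattern of directional arrows in Fig. 20.5.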
20.6 Limitations of Dynamic Analysis

The static analysis presented in Part 2 of this volume dealt only with the question of what 
the equilibrium position will be under certain given conditions of a model. The major query 
was: What values of the variables, if attained, will tend to perpetuate themselves? But the 
attainability of the equilibrium position was taken for granted. When we proceeded to the 
realm of comparative statics, in Part 3, the central question shifted to a more interesting 
problem: How will the equilibrium position shift in response to a certain change in a
parameter? But the attainability aspect was again brushed aside. It was not until we reached the
dynamic analysis in Part 5 that we looked the question of attainability squarely in the eye.
Here we specifically ask: If initially we are away from an equilibrium position (say,
because of a recent disequilibrating parameter change), will the various forces in the
model tend to steer us toward the new equilibrium position? Furthermore, in a dynamic
analysis, we also learn the particular character of the path (whether steady, fluctuating, or
oscillatory) the variable will follow on its way to the equilibrium (if at all). The significance
of dynamic analysis should therefore be self-evident. 

However, in concluding its discussion, we should also take cognizance of the limitations
of dynamic analysis. For one thing, to make the analysis manageable, dynamic models are 
often formulated in terms of linear equations. While simplicity may thereby be gained, the 
assumption of linearity will in many cases entail a considerable sacrifice of realism. Since 
a time path which is germane to a linear model may not always approximate that of a non¬ 
linear counterpart, as we have seen in the price-ceiling example in Sec. 17.6, care must be 
exercised in the interpretation and application of the results of linear dynamic models. In
this connection, however, the qualitative-graphic approach may perform an extremely valu¬ 
able service, because under quite general conditions it can enable us to incorporate nonlin¬ 
earity into a model without adding undue complexity to the analysis. 

Another shortcoming usually found in dynamic economic models is the use of constant 
coefficients in differential or difference equations. Inasmuch as the primary role of the
coefficients is to specify the parameters of the model, the constancy of coefficients (again
assumed for the sake of mathematical manageability) essentially serves to "freeze" the economic
environment of the problem under investigation. In other words, it means that the endogenous
adjustment of the model is being studied in a sort of economic vacuum, such that
no exogenous factors are allowed to intrude. In certain cases, of course, this problem may not 
be too serious, because many economic parameters do tend to stay relatively constant over 
long periods of time. And in some other cases, we may be able to undertake a comparative- 
dynamic type of analysis, to see how the time path of a variable will be affected by a change
in certain parameters. Nevertheless, when we are interpreting a time path that extends into the 
distant future, we should always be careful not to be overconfident about the validity of the 
path in its more remote stretches, if simplifying assumptions of constancy have been made. 

You realize, of course, that to point out its limitations as we have done here is by no 
means intended to disparage dynamic analysis as such. Indeed, it will be recalled that each 
type of analysis hitherto presented has been shown to have its own brand of limitations. As 
long as it is duly interpreted and properly applied, therefore, dynamic analysis—like any 
other type of analysis—can play an important part in the study of economic phenomena. In 
particular, the techniques of dynamic analysis have enabled us to extend the study of opti¬ 
mization into the realm of dynamic optimization in this chapter, in which the solution we 
seek is no longer a static optimum state, but an entire optimal time path. 
Mathematical Symbols


1. Sets

$a \in S$                       a is an element of (belongs to) set S
$b \notin S$                    b is not an element of set S
$S \subset T$                   set S is a subset of (is contained in) set T
$T \supset S$                   set T includes set S
$A \cup B$                      the union of set A and set B
$A \cap B$                      the intersection of set A and set B
$\tilde{S}$                     the complement of set S
$\{\ \}$ or $\emptyset$         the null set (empty set)
$\{a, b, c\}$                   the set with elements a, b, and c
$\{x \mid x$ has property $P\}$ the set of all objects with property P
$\min\{a, b, c\}$               the smallest element of the specified set
$R$                             the set of all real numbers
$R^2$                           the two-dimensional real space
$R^n$                           the n-dimensional real space
$(x, y)$                        ordered pair
$(x, y, z)$                     ordered triple
$(a, b)$                        open interval from a to b
$[a, b]$                        closed interval from a to b


2. Matrices and Determinants

$A'$ or $A^T$                   the transpose of matrix A
$A^{-1}$                        the inverse of matrix A
$|A|$                           the determinant of matrix A
$|J|$                           Jacobian determinant
$|H|$                           Hessian determinant
$|\bar{H}|$                     bordered Hessian determinant
$r(A)$                          the rank of matrix A
$\operatorname{tr} A$           the trace of A
$0$                             null matrix (zero matrix)
$u \cdot v$                     the inner product (dot product) of vectors u and v
$u'v$                           the scalar product of two vectors


3. Calculus

Given $y = f(x)$, a function of a single variable x:

$\lim_{x \to \infty} f(x)$                              the limit of f(x) as x approaches infinity
$dy$                                                    the first differential of y
$d^2y$                                                  the second differential of y
$\dfrac{dy}{dx}$ or $f'(x)$                             the first derivative of the function y = f(x)
$\left.\dfrac{dy}{dx}\right|_{x=x_0}$ or $f'(x_0)$      the first derivative evaluated at $x = x_0$
$\dfrac{d^2y}{dx^2}$ or $f''(x)$                        the second derivative of y = f(x)
$\dfrac{d^ny}{dx^n}$ or $f^{(n)}(x)$                    the nth derivative of y = f(x)
$\int f(x)\,dx$                                         indefinite integral of f(x)
$\int_a^b f(x)\,dx$                                     definite integral of f(x) from x = a to x = b

Given the function $y = f(x_1, x_2, \ldots, x_n)$:

$\dfrac{\partial y}{\partial x_i}$ or $f_i$             the partial derivative of f with respect to $x_i$
$\nabla f \equiv \operatorname{grad} f$                 the gradient of f
$\dfrac{dy}{dx_i}$                                      the total derivative of f with respect to $x_i$
$\dfrac{\S y}{\S x_i}$                                  the partial total derivative of f with respect to $x_i$


4. Differential and Difference Equations

$\dot{y} \equiv \dfrac{dy}{dt}$   the time derivative of y
$\Delta y_t$                      the first difference of $y_t$
$\Delta^2 y_t$                    the second difference of $y_t$
$y_p$                             particular integral
$y_c$                             complementary function

5. Others

$\sum_{i=1}^{n} x_i$            the sum of $x_i$ as i ranges from 1 to n
$p \Rightarrow q$               p only if q (p implies q)
$p \Leftarrow q$                p if q (p is implied by q)
$p \Leftrightarrow q$           p if and only if q
iff                             if and only if
$|m|$                           the absolute value of the number m
$n!$                            n factorial $\equiv n(n - 1)(n - 2) \cdots (3)(2)(1)$
$\log_b x$                      the logarithm of x to base b
$\log_e x$ or $\ln x$           the natural logarithm of x (to base e)
$e$                             the base of natural logarithms and natural exponential functions
$\sin\theta$                    sine function of $\theta$
$\cos\theta$                    cosine function of $\theta$
$R_n$                           the remainder term when the Taylor series involves an nth-degree polynomial



Answers to Selected Exercises


Exercise 2.3

1. (a) {x | x > 34}

3. (a) {2, 4, 6, 7}   (c) {2, 6}   (e) {2}

8. There are 16 subsets.

9. Hint: Distinguish between the two symbols £ and 

Exercise 2.4 

1. (a) {(3, a), (3, b), (6, a), (6, b), (9, a), (9, b)}

3. No.

5. Range = {y | 8 ≤ y ≤ 32}

Exercise 2.5

2. (a) and (b) differ in the sign of the slope measure; (a) and (c) differ in the vertical
intercept.

4. When negative values are permissible, quadrant III has to be used too.

5. (a) x 1 * 

6 . (a) x 6 

Exercise 3.2 

1. $P^* = 2\tfrac{3}{11}$, and $Q^* = 14\tfrac{2}{11}$

3. Note: In 2(a), c = 10 (not 6).

5. Hint: $b + d = 0$ implies $d = -b$.

Exercise 3.3 

1. (a) $x_1^* = 5$, and $x_2^* = 3$

3. (a) $(x - 6)(x + 1)(x - 3) = 0$, or $x^3 - 8x^2 + 9x + 18 = 0$
5. (a) -1, 2, and 3 (c) -1, and -5
Exercise 3.4

3. p,* = 3i p; = 3i e; = n* e: = 8T3 

Exercise 3.5 


1. (b) $Y^* = (a - bd + I_0 + G_0)/[1 - b(1 - t)]$

$T^* = [d(1 - b) + t(a + I_0 + G_0)]/[1 - b(1 - t)]$

$C^* = [a - bd + b(1 - t)(I_0 + G_0)]/[1 - b(1 - t)]$

3. Hint: After substituting the last two equations into the first, consider the resulting
equation as a quadratic equation in the variable $w = Y^{1/2}$. Only one root is acceptable,
$w_1^* = 11$, giving $Y^* = 121$ and $C^* = 91$. The other root leads to a negative $C^*$.

Exercise 4.1 

1. The elements in the (column) vector of constants are: 0, a, —c. 


Exercise 4.2 


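1. (a) $\begin{bmatrix} 7 & 3 \\ 9 & 7 \end{bmatrix}$    (c) $\begin{bmatrix} 21 & -3 \\ 18 & 27 \end{bmatrix}$

3. In this special case, $AB$ happens to be equal to $BA = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

4. (b) $\begin{bmatrix} 49 & 3 \\ 4 & 3 \end{bmatrix}$ $(2 \times 2)$    (c) $\begin{bmatrix} 3x + 5y \\ 4x + 2y - 7z \end{bmatrix}$ $(2 \times 1)$

6. (a) $x_2 + x_3 + x_4 + x_5$    (c) $b(x_1 + x_2 + x_3 + x_4)$

7. (b) $\sum_{i=2} a_i(x_{i+1} + i)$    (d) Hint: $x^0 = 1$ for $x \neq 0$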


Exercise 4.3 


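1. (a) $uv' = \begin{bmatrix} 15 & 5 & -5 \\ 3 & 1 & -1 \\ 9 & 3 & -3 \end{bmatrix}$    (c) $xx' = \begin{bmatrix} x_1^2 & x_1x_2 & x_1x_3 \\ x_2x_1 & x_2^2 & x_2x_3 \\ x_3x_1 & x_3x_2 & x_3^2 \end{bmatrix}$

(e) $u'v = 13$

3. (a) $\sum_{i=1}^{n} P_iQ_i$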

5. ft) 2i> = ^ 
o 

7. (a) $d = \sqrt{27}$
9. ft) d(v, 0) = 


(g) $u'u = 35$

(c) $P \cdot Q$ or $P'Q$ or $Q'P$

t , r 51 

ft) u - v = _ 2 

(w) [ r- 


Exercise 4.4 


l. ft) 


5 17 
1 17 
2. No; it should be $A - B = -B + A$.

4. (a) $k(A + B) = k[a_{ij} + b_{ij}] = [ka_{ij} + kb_{ij}] = [ka_{ij}] + [kb_{ij}] = k[a_{ij}] + k[b_{ij}] = kA + kB$ (Can you justify each step?)


Exercise 4.5 


1. (a) $AI_3 = \begin{bmatrix} -1 & 5 & 7 \\ 0 & -2 & 4 \end{bmatrix}$

3. (a) $5 \times 3$ (c) $2 \times 1$

4. Hint: Multiply the given diagonal matrix by itself, and examine the resulting product 
matrix for conditions for idempotency. 


Exercise 4.6 


1. A' = 


0 -I 
4 3 


and B' - 


3 0 
S 1 


3. Hint: Define D = AB , and apply (4.11). 

5. Hint: Define D = AB, and apply (4.14). 


Exercise 5.1 

1. (a) (5.2) (c) (5.3) (e) (5.3)

3. (a) Yes. (d) No.

5. (a) r(A) = 3; A is nonsingular. (b) r(B) = 2; B is singular.


Exercise 5.2 

1. (a) -6 (c) 0 (e) $3abc - a^3 - b^3 - c^3$

3. $|M_b| = \begin{vmatrix} d & f \\ g & i \end{vmatrix}$    $|C_b| = -\begin{vmatrix} d & f \\ g & i \end{vmatrix}$

4. (a) Hint: Expand by the third column.

5. 20 (not -20) 


Exercise 5.3 

3. (a) Property IV (b) Property III (applied to both rows).

4. (a) Singular. (c) Singular. 

5. (a) Rank < 3 (c) Rank < 3 

7. A is nonsingular because $|A| = 1 - b \neq 0$.


Exercise 5.4 

1. $\sum_{i=1}^{4} a_{i3}|C_{i2}|$    $\sum_{j=1}^{4} a_{2j}|C_{4j}|$

3. (a) Interchange the two diagonal elements of A; multiply the two off-diagonal
elements of A by -1.

(b) Divide by $|A|$.
4. (a) $E^{-1} = \frac{1}{20}\begin{bmatrix} 3 & 2 & -3 \\ -7 & 2 & 7 \\ -6 & -4 & 26 \end{bmatrix}$    (c) $G^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$


Exercise 5.5 

1. (a) $x_1^* = 4$, and $x_2^* = 3$ (c) $x_1^* = 2$, and $x_2^* = 1$


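2. (a) $A^{-1} = \frac{1}{7}\begin{bmatrix} 1 & 2 \\ -2 & 3 \end{bmatrix}$; $x^* = \begin{bmatrix} 4 \\ 3 \end{bmatrix}$

(c) $A^{-1} = \frac{1}{15}\begin{bmatrix} 1 & 7 \\ -1 & 8 \end{bmatrix}$; $x^* = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$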

3. (a) $x_1^* = 2$, $x_2^* = 0$, $x_3^* = 1$ (c) $x^* = 0$, $y^* = 3$, $z^* = 4$

4. Hint: Apply (5.8) and (5.13).

Exercise 5.6 

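1. (a) $A^{-1} = \frac{1}{1 - b + bt}\begin{bmatrix} 1 & 1 & -b \\ b(1 - t) & 1 & -b \\ t & t & 1 - b \end{bmatrix}$

$\begin{bmatrix} Y^* \\ C^* \\ T^* \end{bmatrix} = \frac{1}{1 - b + bt}\begin{bmatrix} I_0 + G_0 + a - bd \\ b(1 - t)(I_0 + G_0) + a - bd \\ t(I_0 + G_0) + at + d(1 - b) \end{bmatrix}$

(b) $|A| = 1 - b + bt$    $|A_1| = I_0 + G_0 - bd + a$

$|A_2| = a - bd + b(1 - t)(I_0 + G_0)$    $|A_3| = d(1 - b) + t(a + I_0 + G_0)$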

Exercise 5.7 


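1. $x_1^* = 69.53$, $x_2^* = 57.03$, and $x_3^* = 42.58$

3. (a) $A = \begin{bmatrix} 0.10 & 0.50 \\ 0.60 & 0 \end{bmatrix}$; the matrix equation is $\begin{bmatrix} 0.90 & -0.50 \\ -0.60 & 1.00 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1{,}000 \\ 2{,}000 \end{bmatrix}$

(c) $x_1^* = 3{,}333\tfrac{1}{3}$, and $x_2^* = 4{,}000$

4. Element 0.33: 33¢ of commodity II is needed as input for producing $1 of commodity I.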
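
The two-industry system in answer 3 of Exercise 5.7 can be solved by Cramer's rule as a cross-check. This is a minimal sketch, not the book's own procedure: the variable names (`T`, `d`, `x1`, `x2`) are illustrative, and exact fractions are used so that the one-third in the first output level is not lost to rounding.

```python
from fractions import Fraction as F

# Leontief matrix (I - A) and final-demand vector, taken from the answer to item 3.
T = [[F(9, 10), F(-5, 10)],
     [F(-6, 10), F(1)]]
d = [F(1000), F(2000)]

# Cramer's rule for a 2 x 2 system: replace one column of T by d at a time.
det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
x1 = (d[0] * T[1][1] - T[0][1] * d[1]) / det
x2 = (T[0][0] * d[1] - d[0] * T[1][0]) / det

print(x1, x2)
```

Cramer's rule is convenient here only because the system is 2 x 2; for larger input-output models, matrix inversion or elimination would be the more practical route.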

Exercise 6.2 

1. (a) $\Delta y/\Delta x = 8x + 4\Delta x$ (b) $dy/dx = 8x$ (c) $f'(3) = 24$, $f'(4) = 32$
3. (a) $\Delta y/\Delta x = 5$; a constant function.
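
The difference quotient in answer 1 can be verified numerically. The sketch below assumes the underlying function is y = 4x^2 + 9; the function is not stated in this answer key, so that is an assumption, chosen because it is consistent with the quotient 8x + 4Δx (any constant term would give the same result). The helper names are illustrative.

```python
from fractions import Fraction

def f(x):
    # Assumed function for item 1 (not stated in the answer key).
    return 4 * x**2 + 9

def difference_quotient(func, x, dx):
    """Average rate of change of func between x and x + dx."""
    return (func(x + dx) - func(x)) / dx

x = Fraction(3)
for dx in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    dq = difference_quotient(f, x, dx)
    # The quotient should equal 8x + 4*dx exactly, approaching 8x as dx shrinks.
    print(dx, dq, dq == 8 * x + 4 * dx)
```

As Δx shrinks, the 4Δx term vanishes and the quotient approaches the derivative 8x, which gives the values in part (c).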


Exercise 6.4 

1. Left-side limit = right-side limit = 15; the limit is 15. 

3. (a) 5 (b) 5

Exercise 6.5 

1. (a) -3/4 < x (c) x < 1/2 

3. (a) -7 < x < 5 (c) -4 < x < 1 

Exercise 6.6 

1. (a) 7 (c) 17 

3. (a) 2i (c) 2 
Exercise 6.7

2. (a) $N^2 - 5N - 2$ (b) Yes. (c) Yes.

3. (a) $(N + 2)/(N^2 + 2)$ (b) Yes. (c) Continuous in the domain.

6. Yes; each function is continuous and smooth. 

Exercise 7.1 

1. (a) $dy/dx = 12x^{11}$ (c) $dy/dx = 35x^4$ (e) $dw/du = -2u^{-1/2}$

3. (a) $f'(x) = 18$; $f'(1) = f'(2) = 18$
(c) $f'(x) = 10x^{-3}$; $f'(1) = 10$, $f'(2) = 1\tfrac{1}{4}$

Exercise 7.2 

1. $VC = Q^3 - 5Q^2 + 12Q$; $dVC/dQ = 3Q^2 - 10Q + 12$ is the MC function.

3. (a) $3(27x^2 + 6x - 2)$ (c) $12x(x + 1)$ (e) $-x(9x + 14)$

4. (b) $MR = 60 - 6Q$

7. (a) $(x^2 - 3)/x^2$ (c) $30/(x + 5)^2$

8. (a) $a$ (c) $-a/(ax + b)^2$

Exercise 7.3 

1. $-2x[3(5 - x^2)^2 + 2]$

3. (a) $18x(3x^2 - 13)^2$ (c) $5a(ax + b)^4$

5. $x = \tfrac{1}{7}y - 3$, $dx/dy = \tfrac{1}{7}$

Exercise 7.4 

1. (a) $\partial y/\partial x_1 = 6x_1^2 - 22x_1x_2$, and $\partial y/\partial x_2 = -11x_1^2 + 6x_2$
(c) $\partial y/\partial x_1 = 2(x_2 - 2)$, and $\partial y/\partial x_2 = 2x_1 + 3$
3. (a) 12 (c) 10/9

5. (a) $U_1 = 2(x_1 + 2)(x_2 + 3)^3$, and $U_2 = 3(x_1 + 2)^2(x_2 + 3)^2$

Exercise 7.5 

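1. $\partial Q^*/\partial a = d/(b + d) > 0$    $\partial Q^*/\partial b = -d(a + c)/(b + d)^2 < 0$
$\partial Q^*/\partial c = -b/(b + d) < 0$    $\partial Q^*/\partial d = b(a + c)/(b + d)^2 > 0$

2. $\partial Y^*/\partial I_0 = \partial Y^*/\partial \alpha = 1/(1 - \beta + \beta\delta) > 0$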

Exercise 7.6 

1. (a) $|J| = 0$; the functions are dependent.

(b) $|J| = -20x_2$; the functions are independent.

Exercise 8.1 

1. (a) $dy = -3(x^2 + 1)\,dx$ (c) $dy = [(1 - x^2)/(x^2 + 1)^2]\,dx$

3. (a) $dC/dY = b$, and $C/Y = (a + bY)/Y$

Exercise 8.2 

2. (a) $dz = (6x + y)\,dx + (x - 6y^2)\,dy$





3. (a) $dy = [x_2/(x_1 + x_2)^2]\,dx_1 - [x_1/(x_1 + x_2)^2]\,dx_2$

4. $\varepsilon_{QP} = 2bP^2/(a + bP^2 + R^{1/2})$

6. $\varepsilon_{XP} = -2/(Y_f^{1/2}P^2 + 1)$

Exercise 8.3 

3. (a) $dy = 3[(2x_2 - 1)(x_3 + 5)\,dx_1 + 2x_1(x_3 + 5)\,dx_2 + x_1(2x_2 - 1)\,dx_3]$

4. Hint: Apply the definitions of differential and total differential. 

Exercise 8.4 

1. (a) $dz/dy = x + 10y + 6y^2 = 28y + 9y^2$
(c) $dz/dy = -15x + 3y = 108y - 30$

3. $dQ/dt = [a\alpha A/K + b\beta A/L + A'(t)]K^{\alpha}L^{\beta}$

4. (b) §W/§u $= 10uf_1 + f_2$    §W/§v $= 3f_1 - 12v^2f_2$

Exercise 8.5 

5. (a) Defined; $dy/dx = -(3x^2 - 4xy + 3y^2)/(-2x^2 + 6xy) = -9/8$

(b) Defined; $dy/dx = -(4x + 4y)/(4x - 4y^3) = 2/13$

7. The condition $F_y \neq 0$ is violated at (0, 0).

8. The product of partial derivatives is equal to -1. 

Exercise 8.6 

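1. (e) $(dY^*/dG_0) = 1/(S' + T' - I') > 0$

3. $(\partial P^*/\partial Y_0) = D_{Y_0}/(S_{P^*} - D_{P^*}) > 0$    $(\partial Q^*/\partial Y_0) = D_{Y_0}S_{P^*}/(S_{P^*} - D_{P^*}) > 0$

$(\partial P^*/\partial T_0) = -S_{T_0}/(S_{P^*} - D_{P^*}) > 0$    $(\partial Q^*/\partial T_0) = -S_{T_0}D_{P^*}/(S_{P^*} - D_{P^*}) < 0$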

Exercise 9.2 

1. (a) When x = 2, y = 15 (a relative maximum).

(c) When x = 0, y = 3 (a relative minimum).

2. (a) The critical value x = -1 lies outside the domain; the critical value x = 1 leads to
y = 3 (a relative minimum).

4. (d) The elasticity is one.

Exercise 9.3 

1. (a) $f''(x) = 2a$, $f'''(x) = 0$ (c) $f''(x) = 6(1 - x)^{-3}$, $f'''(x) = 18(1 - x)^{-4}$

3. (b) A straight line.

5. Every point on f(x) is a stationary point, but the only stationary point on g(x)
we know of is at x = 3.

Exercise 9.4 

1. (a) $f(2) = 33$ is a maximum.

(c) $f(1) = 5\tfrac{1}{3}$ is a maximum; $f(5) = -5\tfrac{1}{3}$ is a minimum.

2. Hint: First write an area function A in terms of one variable (either L or W) alone.

3. (d) $Q^* = 11$ (e) Maximum profit $= 111\tfrac{1}{3}$
5. (a) k < 0 (b) h < 0 (c) j > 0

7. (b) S is maximized at the output level 20.37 (approximately).

Exercise 9.5 

1. (a) 120 (c) 4 (e) $(n + 2)(n + 1)$

2. (a) $1 + x + x^2 + x^3 + x^4$

3. (b) $-63 - 98x - 62x^2 - 18x^3 - 2x^4 + R_4$

Exercise 9.6 

1. (a) $f(0) = 0$ is an inflection point. (c) $f(0) = 5$ is a relative minimum.

2. (b) $f(2) = 0$ is a relative minimum.

Exercise 10.1 

1. (a) Yes. (b) Yes.

3. (a) 5e* (c) -12f 2 ' 

5. (a) The curve with a = -1 is the mirror image of the curve with a = 1 with reference 
to the horizontal axis. 

Exercise 10.2 

1. (a) 7.388 (b) 1.649

2. (c) $1 + 2x + \frac{1}{2!}(2x)^2 + \frac{1}{3!}(2x)^3 + \cdots$

3. (a) $\$70e^{0.12}$ (b) $\$690e^{0.10}$

Exercise 10.3 

1. (a) 4 (c) 4

2. (a) 7 (c) -3 (e) 6

3. (a) 26 (c) $\ln 3 - \ln B$ (f) 3

Exercise 10.4 

1. The requirement prevents the function from degenerating into a constant function. 

3. Hint: Take log to base b.

4. (a) y = ^ |I|S) ' or >> = e™" (c) y = 5t- (ln5)r or y = 5e xm$ ‘ 

5. (a) $t = (\ln y)/(\ln 7)$ or $t = 0.5139 \ln y$

(c) $t = 3\ln(9y)/(\ln 15)$ or $t = 1.1078\ln(9y)$

6. (a) $r = \ln 1.05$ (c) $r = 2\ln 1.03$

Exercise 10.5 

1. (a) $2e^{2t+4}$ (c) $2te^{t^2+1}$ (e) $(2ax + b)e^{ax^2+bx+c}$

3. (a) $5/t$ (c) $1/(t + 19)$ (e) $1/[x(1 + x)]$

5. Hint: Use (10.21), and apply the chain rule.

7. (a) $3(8 - x^2)/[(x + 2)^2(x + 4)^2]$
Exercise 10.6

1. $t^* = 1/r^2$

2. $d^2A/dt^2 = -A(\ln 2)/4\sqrt{t^3} < 0$

Exercise 10.7 

1. (a) $2/t$ (c) $\ln b$ (e) $1/t - \ln 3$
3. $r_y = kr_x$
7. $|\varepsilon_d| = n$

11. $r_Q = \varepsilon_{QK}r_K + \varepsilon_{QL}r_L$

Exercise 11.2 

1. z* = 3 is a minimum. 

3. z* = c, which is a minimum in case (a), a maximum in case (b), and a saddle point in
case (c).

5. (a) Any pair (x, y) other than (2, 3) yields a positive z value.

(b) Yes. (c) No. (d) Yes ($d^2z = 0$).

Exercise 11.3 

1. (a) $q = 4u^2 + 4uv + 3v^2$ (c) $q = 5x^2 + 6xy$

3. (a) Positive definite. (c) Neither. 

5. (a) Positive definite. (c) Negative definite. (e) Positive definite. 

6. (a) $r_1, r_2 = \tfrac{1}{2}(7 \pm \sqrt{17})$; $u'Du$ is positive definite.

(c) $r_1, r_2 = \tfrac{1}{2}(5 \pm \sqrt{61})$; $u'Fu$ is indefinite.


ijS 


-1/V5 

MS. 

> ^2 - 

. VS . 


Exercise 11.4 

1. z* = 0 (minimum) 

3. z* = -11 /40 (minimum) 

4. $z^* = 2 - e$ (minimum), attained at $(x^*, y^*, w^*) = (0, 0, 1)$

5. (b) Hint: See (11.16).

6. (a) $r_1 = 2$    $r_2 = 4 + \sqrt{6}$    $r_3 = 4 - \sqrt{6}$

Exercise 11.5 

1. (a) Strictly convex. (c) Strictly convex. 

2. (a) Strictly concave. (c) Neither. 

3. No. 

5. (a) Disk. (b) Yes.

7. (a) Convex combination, with $\theta = 0.5$. (b) Convex combination, with $\theta = 0.2$.
Exercise 11.6

1. (a) No. (b) $Q_1^* = P_{10}/4$ and $Q_2^* = P_{20}/4$

3. \Sin | = 11 It'fi’l —lj 1 = 15 

5. (a) 7T = PiiQ(a, b)(\ + j/o) -2 - P u o« - Pmb 

Exercise 11.7 

1. (da*/dP a0 ) = P 0 Qbhe- Ft /\J\ < ft (3b*/dP a o) = -PoQ^nJl < 0 

2. M Four. (b) (3a*/HP Q )=(Q h Q al ,-Q a Q M )P 0 (l + i 0 )- 2 /\J\>Q 
(c) (3flVt»/'o) = (QaQbb - QhQ„b)P^ +io) _3 /l^l < 0 

Exercise 12.2 

1. (a) $z^* = 1/2$, attained when $\lambda^* = 1/2$, $x^* = 1$, and $y^* = 1/2$
(c) $z^* = -19$, attained when $\lambda^* = -4$, $x^* = 1$, and $y^* = 5$

4. $Z_{\lambda} = -G(x, y) = 0$    $Z_x = f_x - \lambda G_x = 0$    $Z_y = f_y - \lambda G_y = 0$

5. Hint: Distinguish between identical equality and conditional equality. 

Exercise 12.3 

1. (a) $|\bar{H}| = 4$; z* is a maximum. (c) $|\bar{H}| = -2$; z* is a minimum.

Exercise 12.4 

2. (a) Quasiconcave, but not strictly so. (c) Strictly quasiconcave. 

4. (a) Neither. (c) Quasiconvex, but not quasiconcave.

5. Hint: Review Sec. 9.4.

7. Hint: Use either (12.21) or (12.25').

Exercise 12.5 

1. (b) $\lambda^* = 3$, $x^* = 16$, $y^* = 11$ (c) $|\bar{H}| = 48$; condition is satisfied.

3. $(\partial x^*/\partial B) = 1/2P_x > 0$    $(\partial x^*/\partial P_x) = -(B + P_y)/2P_x^2 < 0$

$(\partial x^*/\partial P_y) = 1/2P_x > 0$    etc.

5. Not valid.

7. No to both (a) and (b); see (12.32) and (12.33').

Exercise 12.6 

1. (a) Homogeneous of degree one. (c) Not homogeneous.

(e) Homogeneous of degree two.

4. They are true.

7. (a) Homogeneous of degree a + b + c.

8. (a) $j^2Q = g(jK, jL)$ (b) Hint: Let $j = 1/L$.

(d) Homogeneous of degree one in K and L.
Exercise 12.7

1. (a) 1 : 2 : 3 (b) 1:4:9 

2. Hint: Review Figs. 8.2 and 8.3. 

4. Hint: This is a total derivative. 

6. (a) Downward-sloping straight lines. (b) $\sigma \to \infty$ as $\rho \to -1$
8. (a) 7 (c) In 5 - 1 

Exercise 13.1 

3. The conditions $x_j(\partial Z/\partial x_j) = 0$ and the conditions $\lambda_i(\partial Z/\partial \lambda_i) = 0$ can be condensed.

5. Consistent. 

Exercise 13.2 

1. No qualifying arc can be found for a test vector such as $(dx_1, dx_2) = (1, 0)$.

3. $(x_1^*, x_2^*) = (0, 0)$ is a cusp. The constraint qualification is satisfied (all test vectors are
horizontal and pointing eastward); the Kuhn-Tucker conditions are satisfied, too. 

4. All the conditions can be satisfied by choosing yl = 0 and y] > 0. 

Exercise 13.4 

2. (a) Yes. (b) Yes. (c) No. 

4. (a) Yes. (b) Yes. 

Exercise 14.2 

1. (a) $-8x^{-2} + c$, $(x \neq 0)$ (c) $\tfrac{1}{6}x^6 - \tfrac{3}{2}x^2 + c$

2. (a) $13e^x + c$ (c) $5e^x - 3x^{-1} + c$, $(x \neq 0)$

3. (a) $3\ln|x| + c$, $(x \neq 0)$ (c) $\ln(x^2 + 3) + c$

4. (a) $\tfrac{2}{3}(x + 1)^{3/2}(x + 3) - \tfrac{4}{15}(x + 1)^{5/2} + c$

Exercise 14.3 

(b)l\ (^|+tj 

2. (a) $\tfrac{1}{2}(e^{-2} - e^{-4})$ (c) $e^2(\tfrac{1}{2}e^4 - \tfrac{1}{2}e^2 + e - 1)$

3. (b) Underestimate. (e) f(x) is Riemann integrable.

Exercise 14.4 

1. None. 

2. (fl),(c),(f/) and O). 

3. (a), (c) and (d) convergent; (e) divergent.

Exercise 14.5 

1. (a) $R(Q) = 14Q^2 - \tfrac{10}{3}e^{0.3Q} + \tfrac{10}{3}$ (b) $R(Q) = 10Q/(1 + Q)$
3. (a) $K(t) = 9t^{4/3} + 25$

5. (a) 29,000 
Exercise 14.6

1. Capital alone is considered. Since labor is normally necessary for production as well, 
the underlying assumption is that K and L are always used in a fixed proportion. 

3. Hint: Use (6.8).

4. Hint: $\ln u - \ln v = \ln(u/v)$

Exercise 15.1 

l.(a) y(t) = -e~* + 3 fo)y(r)=f(l- ( >- ,n ') 

3 . («) y(t) = 4(1 - e~') (c) y(t) - 6e‘” (e) y(t) = 8e 7 ' - 1 

Exercise 15.2 

1. The D curve should be steeper. 

3. The price adjustment mechanism generates a differential equation. 

5. ( u) P{t ) = /l exp (b) Yes. 

V n / P+b 

Exercise 15.3 

1. $y(t) = Ae^{-5t} + 3$
3. $y(t) = e^{-t^2} + \tfrac{1}{2}$

5. $y(t) = e^{-6t} - \tfrac{1}{7}e^t$

6. Hint: Review Sec. 14.2, Example 17.

Exercise 15.4 

1. (a) $y(t) = (c/t^3)^{1/2}$ (c) $yt + y^2t = c$

Exercise 15.5 

1. (a) Separable; linear when written as $dy/dt + (1/t)y = 0$

(c) Separable; reducible to a Bernoulli equation. 

3. y(t) = {A - r) 1 ' 2 

Exercise 15.6 

1. (a) Upward-sloping phase line; dynamically unstable equilibrium.

(c) Downward-sloping phase line; dynamically stable equilibrium. 

3. The sign of the derivative measures the slope of the phase line. 

Exercise 15.7 

1. $r_k = r_K - r_L$ [cf. (10.25)]

4. (a) Plot (3 - y) and ln y as two separate curves, and then subtract. A single

equilibrium exists (at a y value between 1 and 3) and is dynamically stable.
Exercise 16.1

1. (a) $y_p = 2/5$ (c) $y_p = 3$ (e) $y_p = 6t^2$

3. (a) $y(t) = 6e^t + e^{-4t} - 3$ (c) $y(t) = e^t + te^t + 3$

6. Apply L’Hopital’s rule. 

Exercise 16.2 

1, (a) § ± |V3/ (c) ± JV7/ 

3. (b) Hint: When $\theta = \pi/4$, line OP is a 45° line.

5. (a) $\frac{d}{d\theta}\sin f(\theta) = f'(\theta)\cos f(\theta)$ (b) $\frac{d}{d\theta}\cos\theta^3 = -3\theta^2\sin\theta^3$

7. (a) $\sqrt{3} + i$ (c) $1 - i$

Exercise 16.3 

1. $y(t) = e^{2t}(3\cos 2t + \tfrac{1}{2}\sin 2t)$

3.,<0 = « W (-C0S^<+^ S in^l)+3 
S. y{t) = | cos 3/ + sin 3r + } 


Exercise 16.4 

i.(„) P "+?LJLp'-e±i = - a -+y 

n — w n — w n — w 
3. (a) P(t) = e‘ /2 (2cos \t + 2 sin \t) + 2 


(« # < 


(*) ^ = 


a + y 

m 


Exercise 16.5 

1. (a) % +;(1 -g)=j(a - T - 0U) 

at 

(b) No complex roots; no fluctuation.


3. (e) Both are first-order differential equations. 

A n 

A, (a) jr(t) =e I cos — t + Af, sin — t I +■ m 


id) g * 1 

(c) p = m -u = LA m 


Exercise 16.6 

2. (a) y p = t-2 (c) y p = \e' 

Exercise 16.7 

1. (a) y p =4 (c) y p = ^r 2 

3. (a) Divergent. (c) Convergent. 


Exercise 17.2 

1, (o) y t+ 1 = y, + l (c) >V+ 1 = 3y, -9 

3. (a) $y_t = 10 + t$ (c) $y_t = y_0\alpha^t - \beta(1 + \alpha + \alpha^2 + \cdots + \alpha^{t-1})$
Exercise 17.3

1. (a) Nonoscillatory; divergent. (c) Oscillatory; convergent.

3. (a) $y_t = -8(1/3)^t + 9$ (c) $y_t = -2(-1/4)^t + 4$

Exercise 17.4 

1 . Q I= j-fi(P 0 -P)(-S/^ 

3. (a) $\bar{P} = 3$; explosive oscillation. (c) $\bar{P} = 2$; uniform oscillation.

5. The lag in the supply function. 

Exercise 17.5 

1 . a = -i 

3. $P_t = (P_0 - 3)(-1.4)^t + 3$, with explosive oscillation.
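
The time path in answer 3 can be traced numerically to see why the oscillation is called explosive. This is a minimal sketch: the starting price P_0 = 4 is an arbitrary illustrative value (not from the exercise), and the function name `price_path` is hypothetical. Because the base -1.4 is negative and exceeds 1 in absolute value, the deviation from the equilibrium price 3 changes sign every period and grows in magnitude.

```python
def price_path(p0, periods, p_bar=3.0, b=-1.4):
    """Return [P_0, ..., P_periods] for P_t = (P_0 - p_bar) * b**t + p_bar."""
    return [(p0 - p_bar) * b**t + p_bar for t in range(periods + 1)]

# P_0 = 4 is an arbitrary starting value chosen for illustration.
path = price_path(p0=4.0, periods=6)
deviations = [p - 3.0 for p in path]

for t, (p, dev) in enumerate(zip(path, deviations)):
    print(t, round(p, 4), round(dev, 4))
```

With a base between -1 and 0 the same sign alternation would occur, but the deviations would shrink instead, giving a damped oscillation.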

Exercise 17.6

1. No.

2. (b) Nonoscillatory, explosive downward movement.

(d) Damped, steady downward movement toward R.

4. (a) At first downward-sloping, then becoming horizontal.

Exercise 18.1 

l.(a)\±U (cH-1 

3. (a) 4 (stationary) (c) 5 (stationary) 

4. (/>) y, = \/2'^2cos^r-)-siii^^ + 1 

Exercise 18.2 

1. (a) Subcase 1D. (e) Subcase 1C.

3. Hint: Use (18.16).

Exercise 18.3

3. Possibilities v, viii, x, and xi will become feasible.

4. (a) $p_{t+2} - [2 - j(1 - g) - \beta k]p_{t+1} + [1 - j(1 - g) - \beta k(1 - j)]p_t = j\beta km$

W P* | 4 

Exercise 18.4 

1. (a) 1 (c) $3t^2 + 3t + 1$

3. (a) y P = \t (c) y p = 2-t + t 2 

5. (a) 1/2, -1, and 1
Exercise 19.2

2. P + b 1 - 2b + 2 = 0 

3. (a) $x_t = -(3)^t + 4(-2)^t + 7$    $y_t = 2(3)^t + 2(-2)^t + 5$

4. (a) $x(t) = 4e^{-2t} - 3e^{-3t} + 12$    $y(t) = -e^{-2t} + e^{-3t} + 4$

Exercise 19.3 

2. (c) t-) = (&l-A)- l u 

3. (c) fi = {pl + F-A)- [ X 

5, (c) -ci(r) = 4<r 4 ' /ici + 2t , - ll '' 1 ° + ^e'' /in ; x 2 (0 = 3«- 4f/10 -2^ ll ' ,10 +^ 


Exercise 19.4 




A, 

/33+yi93\ 

4 2 

u c _ 

— 

' 23 - ^193 

64 + 

! 23 + a/193 

i 4 



_ 48 ' 4| _ 

i- -i 

\ / 

7 

L 48 J 


n 


0 


33-VTwY 
64 J 


Exercise 19.5 

1. The single equation can be rewritten as two first-order equations.

2. Yes. 

4. (a) Saddle point. 

Exercise 19.6 

1. (a) $|J_E| = 1$ and tr $J_E = 2$; locally unstable node.

(c) $|J_E| = 5$ and tr $J_E = -1$; locally stable focus.

2. (a) Locally a saddle point. (e) Locally stable node or stable focus. 

4. (a) The x' = 0 and y' = 0 curves coincide, and provide a lineful of equilibrium 
points. 

Exercise 20.2 

] — t t t 2 

i. r = i-t — v 1 = - - — + 2 

2 '24 

6. X*(i) = 3e 4 ~ r - 3 «*(/) = 2 /(f) = 7e‘-2 

Exercise 20.4 

1. r = <5/(6 2 + o) /T = l/2(<5 2 + a) 
Discount factor, 266
dynamic stability of equilibrium
Distinct real roots, 507-508, 570-571
Dorfman, R., 45n

Double roots, 508 

Dual problems, 435-441 
Dunn, Sarah, 79n

Dynamic analysis, limitations of, 654
Dynamic equations

high-order, transformation of,
593-594

simultaneous, solving, 594-603
Dynamic instability, 497
Dynamic optimization, 442, 631
Dynamic stability, 497
Dynamic stability of equilibrium,
481-482

with continuous time, 510,525-527 
with discrete time, 551-554,573-575 
Dynamic stability of equilibrium
(Continued)

local stability of nonlinear system,
of Cobb-Douglas function, 396
Elimination of variables, 33-34, 

Hi, 116 

Endogenous variables 

exogenous variables vs., 5-6 
Jacobian determinant, 203, 208-212,
343-344, 353

Enthoven, A. C., 369n, 425n, 426n
Envelope theorem, 428-441 
for constrained optimization, 

432-433

derivation of Roy's identity
and, 437-438


maximum-value functions and,

428-435

for unconstrained optimization, 

428-432 

Equality 

matrix, 51,56 
of sets, 10 
Equation(s)

auxiliary, 506
behavioral, 6-7
Bernoulli, 493, 494, 501
characteristic. See Characteristic
equations
conditional, 7
costate, 633, 634, 638
cubic, 35n
definitional, 6

differential. See Differential equation
exponential, 268, 271
homogeneous, 476, 478
of motion, 631, 633-634
nonhomogeneous, 476-478
quadratic. See Quadratic equation
reduced, 477 
state, 633-634,644-645 
Equation system 

consistency and independence in, 
44-45, 85 

dynamic. See Simultaneous 
difference equations; 
Simultaneous differential 
equations 

homogeneous, 105-106, 119-120, 
595,598 

linear, 48, 77-78, 106-107
Equilibrium, 30-47
defined, 30

dynamic stability of. See Dynamic
stability of equilibrium
general, 40-45
goal, 31, 220
intertemporal, 480, 481
moving vs. stationary, 482
in national-income analysis, 46-47
open-economy, 214-216
partial, 31, 43
types of, 618-620

Equilibrium analysis. See Static analysis
Equilibrium condition, 7
Equilibrium identity, 206, 208, 211, 212
Equilibrium output, 236
Equilibrium values, 32
Euclidean n-space, 60, 64, 65
Euler relations, 517-519


Euler's theorem, 385-386, 388-389
Exact differential equation, 486-490
Excess demand, 31, 41

output adjustment and, 605-607
price adjustment and, 480
in relation to inventory, 559-560
Exchange rate, fixed, 214
Exhaustible resource, 647-649
Exogenous variables, 5-6
exp, 259

Expansion path, 392-394 
Expectations 

adaptive, 533, 558, 581
inflation, 533, 536, 581
price, 527-528, 558

Expectations-augmented Phillips
relation, 533-534, 581
Expected rate of inflation, 536
Expected utility from playing, 232
Explosive fluctuation, 525-526
Explosive oscillation, 566, 596
Exponent(s), 21, 23-24, 256
Exponential equation, 268, 271
Exponential function(s), 22, 23, 255,
256-267

base conversion of, 274-276
base of, 256, 259
derivative of, 278-280
discounting and, 266
generalized, 257-259
graphical form of, 256-257
growth and, 260-267
interest compounding and, 262-263
logarithmic functions and, 272-273
Maclaurin series of, 261
natural, 259

Exponential-function rule 
of differentiation, 278
of integration, 448
Exponential law of growth, 255
Exports, net, 213
Extreme value, 221, 293-301
Extremum, 221

absolute vs. relative, 222-223,
291, 319, 347
constrained, 362, 372-374
determinantal test for constrained
extremum, 362
determinantal test for relative
constrained extremum, 362
determinantal test for relative
extremum, 317
first-order condition for, 313
global vs. local, 222-223
in relation to concavity and
convexity, 318-320
in relation to quasiconcavity and
quasiconvexity, 372-374
strong vs. weak, 318 

F 

Factor(s)

discount, 266 
integrating, 489-490 
Factorial, 243 
Factoring 

of determinant vs. matrix, 95
of integrand, 450
of polynomial function, 38-39
Fair bet, 232
Fair game, 232
Final demand, 113
Finite Markov chains, 80
Finite set, 9

First-derivative test, 223-226
First-order condition, 234, 

294-295,402 

derivative vs. differential form of, 
293-292, 293 
for extremum, 313 
necessary vs. sufficient, 295 
Fiscal policy, 534 
Fixed exchange rate, 214
Fixed terminal point, 639
Flow concept, 264,466-467 
Fluctuation 

damped, 526,561 
explosive, 525-526 
stepped, 574-575, 579, 580, 584
time path with, 525-527, 534-537 
uniform, 526 
Focus, 618-619 
Form, 301
Formby, J. R, 240n 
45-degree line, 564
Fraction, 7
Free optimum, 347
Friedman, M., 533
Function(s), 17-28

algebraic vs. nonalgebraic, 23 
argument of, 18 
circular, 23,513-515 
Cobb-Douglas. See Cobb-Douglas 
production function 
complementary. See Complementary 
functions 


concave vs. convex, 230-231. 
318-320 

constant. 20, 21, 148 149,187 
consumption, 46. 576 
continuous vs. discontinuous, 
141-142 

continuously differentiable. 

154, 227 

cubic, 21,22,35n, 38,238-242 
decreasing vs. increasing, 163 
defined, 17 
derivative, 127 

differentiable, 324-327, 368-372
domain of, 18, 19
exponential. See Exponential
function(s)

general vs. specific, 27-28
graphical form of, 22, 516 
Hamiltonian. See Hamiltonian 
function 

homogeneous, See Homogeneous 
functions 

homothetic, 394- 395 
implicit, 194-199 
inverse, 163,272.622 
Lagrangian. See Lagrangian 
functions 
linear, 21,22,27 
logarithmic. See Logarithmic 
functions 

maximum-value, 428-435 
objective, 221,313-317,632, 644 
polynomial See Polynomial 
functions 

production. See Production functions 
profit, 429-430 
quadratic, 21,22,27, 35-36 
quasiconcave vs. quasiconvex,
364-371
range of, 18, 19
rational, 21-23, 142-343
saddle point of, 295, 299, 302
sinusoidal, 514
social-loss, 69
step, 131, 552
Taylor series of, 624
transcendental, 23 
trigonometric, 23, 514 
of two variables, extreme values 
of, 293-301 
value of, 18,19 
zeros of, 36 

Function-of-a-function rule, 162. See
also Chain rule 


G 

Gamkrelidze, R. V, 633n 
General-equilibrium analysis, 43 
Giffen goods, 381 
Global extremum, 222-223
Goal equilibrium, 31, 220 
Greek alphabet, 655 
Gross investment, 466 
Growth 

continuous vs. discrete, 265-266 
Domar model of, 471-474, 475
exponential functions and, 260-267
exponential law of, 255
instantaneous rate of, 263-265,
286-288
negative, 266

neoclassical optimal model of,
649-651

rate of, 263-265, 286-288
Solow model of, 498-502,652 


H 

Hamiltonian function 
current-value, 645, 651 
for optimal control problems, 633. 
634, 635-638. 641,642,651 
Hawkins, D., 116n 
Hawkins-Simon condition, 116 
economic meaning of. 118-119 
principal minor and, 304, 305.306, 
314 

Hessian determinant. 304, 314, 316 
bordered, 358-363.371-372,439n 
Jacobian determinant in relation to. 
343-344 

Hessian matrix, 314 315 
Hicksian demand functions, 436 
Homogeneous equation, 476,478 
Homogeneous-equation system, 

105-106, 119-120.595,598 
Homogeneous functions 

economic applications of. 382, 
383-391) 

linearly, 383-386, 388-389
Homothetic function, 394-395
Horizontal intercept, 274
Horizontal terminal line, 639, 640-643
Hotelling's lemma, 430, 432, 438
Hyperbola, rectangular, 21-23, 561, 580
Hypersurface, 26
I

i, the number, 511
Idempotent matrices, 71, 73, 78
Identity, 6

equilibrium, 206, 208, 211, 212
Roy's, 437-438, 440
Identity matrix, 55, 69, 70-71
Image, 18. See also Mirror images
Imaginary axis, 512 

Imaginary number, 511 
Implicit function, 194-199 
Implicit-function rule, 197-198, 

202, 387 

Implicit-function theorem, 196,198n, 
199 200,201 

application procedure, 210-217 
applied to national-income models, 
203-204,2(0-213 
applied to optimization models, 
343-345. 353-354, 37S 
Income effect, 380,381 
Income increment, 547 
Indefinite integral. 446 454,460 
Independence. See Dependence 
Independent variable, 18 
Indifference curve, 375-378 
Induced investment. 576 
Inequality, 136-139 

absolute values and, 137 I3X 
continued, 136 
rules of. 136 
sense of, 136 
solution of, 138-139 
Inequality constraints. 404-408 
Inferior good, 379
Infinite integrand, 463-464
Infinite series, 261, 517-519
Infinite set, 9

Infinite time horizon, 649-653
Inflation, 533

actual vs. expected rate of. 536 
monetary, 629 

unemployment and, 532-537,
581-585, 609-614
Inflation expectations, 533, 536, 581
Inflection point, 225, 231, 234n,

252, 295

Initial condition, 445
Inner product, 54
Input coefficient, 113
Input-coefficient matrix, 113-114
Input decision, 336-341 
Input-decision model. 343-345 
Input demand, 113 


Input-output model 
closed, 119-120
dynamic, 603-609
Leontief, 112-121
open, 113-116
static, 112-121

Instantaneous rate of change, 126
Instantaneous rate of growth, 263-265,
286-288
Integers, 7 
Integral, 446,475 

definite, 447, 454 461 
economic applications of, 464-470 
improper, 461-464 
indefinite, 446-454,460 
lower vs. upper, 457 
of a multiple, 450 451 
particular. See Particular integral
Riemann. 457,459 
of a sum, 449-450 
Integral calculus. 445 
Integral sign, 446 
Integrand 446 
factoring of, 450 
infinite. 463-464 
Integrating factor, 489 490 
Integration, 445 
constant of 446 
dynamics and, 444-446 
limits of, 454,460,461-463 
by parts, 452-453,460 
Integration rules 

exponential rule, 448 
integration by parts. 452 453, 460 
logarithmic rule, 448 
power rule, 447 
rules of operations, 448-451 
substitution rule, 451-452
Intercept 

horizontal, 274 
vertical 21 

Interest compounding, 262-263 
Interior solution, 403 
Intersection set 11 
Intertemporal equilibrium, 480,481 
Interval, closed vs. open, 133
Invariance property, 382
Inventory, market model with, 559-562
Inverse, 56

Inverse function, 163, 272, 622
Inverse matrices
finding, 99-103
properties of, 75-77
solution of linear-equation system 
and, 77-78 


Investment, 211,471 474 

capital formation and 465^67 
dynamics of. 498-502 
gross. 466 
induced 576 
net, 466,467 
replacement, 466 
Irrational number, 8 
Isocost, 391 

Isoquant, 339-34*1. 391,392 394 
Isovalue curves, 392n
Iterative method, for difference equation, 
546-548 


J 

Jacobian determinant, 45 

endogenous-variable, 203, 208, 212,
343-344, 353

in relation to bordered Hessian, 359
in relation to Hessian, 343-344 


K 

Keynes, J. M„ 46,576 
Keynesian multiplier. 576 
Kuhn, H. W., 402n, 424 
Kuhn-Tucker conditions, 402-412
economic interpretation of, 408-409
effects of inequality constraints,
404-408 

minimization version of, 410 
optimal control theory and, 640 
Kuhn-Tucker sufficiency theorem, 

424 425 


L 

Lag 

in consumption, 576 
in production, 603-605 
in supply, 555 
Lagrange, J. L., 126-127
Lagrange form of the remainder, 
248-249 

Lagrange multiplier 
current-value. 645 

economic interpretation of, 353 354, 
375,391 

general interpretation of, 434-435 
Lagrange-multiplier method 
350-352. 353 
Lagrangian functions

in finding stationary values, 
350-352,354-355 
in nonlinear programming, 403, 
409,410 

Laplace expansion 

by alien cofactors. 99-100 
evaluating an nth-order determinant
by, 91-93
Latent root. 307n 
Layson, S., 240n 
Least-cost combination of inputs, 
390-401 

Leibniz, G, W., 127 

Leontief, W. W., 112

Leontief input-output models, 112-121

Leontief matrix, 115, 116

L'Hôpital's rule, 399, 400

Lifetime utility maximization, 645-647 

Limit. 129-135 

evaluation of, 131-132 
formal view of. 133-135 
of integration, 454, 460, 461-463
left-side vs. right-side, 129-131
of polynomial function, 141
Limit theorems, 139-141
Linear approximation, to a 
function, 246-248 
Linear combination, 61,62 
Linear constraints. 416^418 
Linear dependence, 62-63 
Linear-equation system, 48, 
77-78,106-107 
Linear form, 301 
Linear function, 21,22, 27 
Linear programming, in relation to 
nonlinear programming, 402 
Linearization. See Linear approximation
Linearly homogeneous functions, 

383 386,388-389 
Linearly homogeneous production 
functions, 384-386 
Literary logic, 3 
In, 268 

Local extremum, 222-223 
Log. See Logarithm(s)

Logarithm(s), 48-49, 257, 260-272
common vs. natural, 268-269 
conversion formulas. 271 
elasticity and. 289 
meaning of, 267-268 
rules of. 269-271 

Logarithmic functions, 22,23, 272-277 
base of, 267-269 

exponential functions and, 272-273 


Logarithmic-function rule 
of differentiation, 277-278
of integration, 448 
Logic, mathematical vs. literary, 3 


M 

Machlup, F., 30n, 444n
Maclaurin series, 242-243
convergent, 261 
of cosine function, 518 
of exponential function, 261 
of polynomial function, 242-243 
of sine function, 518 
Mapping, 17-18 
Marginal cost 

average cost vs., 159-160 
total cost vs., 128-129,153, 
464-465 

Marginal physical product. 198 
diminishing, 340,499 
of labor, 163 

Marginal product, value of. 339 
Marginal propensity to consume, 

46,211,547 

Marginal propensity to save, 465 
Marginal rate of substitution, 375 
Marginal rate of technical 
substitution, 391 
absolute value and, 199 
elasticity of substitution and, 396n 
Marginal revenue 

average revenue vs.. 156-158 
upward-sloping, 240-241 
Marginal revenue product, 163 
Marginal utility of money, 375 
Market models, 31-44, 107-108 
comparative statics of, 205-207 
dynamics of, 479-483, 527-532, 
555-562, 565-567 
with inventory, 559-562 
Market price, dynamics of, 479-483. 

527-532,555-562,565-567 
Markov chains, 78-81 
absorbing, 81 
finite, 80

Markov transition matrix, 79-80 
Marshallian demand, 435, 437,438, 439 
Mathematical economics 
defined, 2 
econometrics vs.. 4 
nonmathematical economics 
vs., 2-4 

Mathematical logic, 3 


Mathematical model, 5-7 
Mathematical symbols, 656-658 
Mathematically binding solution, 420 
Matrices, 49-59

addition of, 51- 52. 67 
as arrays, 49-50 
characteristic, 308 
coefficient, 50 
cofactor, 100 
defined, 50 
diagonal, 69, 73 
diagonalization of, 310-311 
dimension of, 50, 53 
division of, 56 
echelon, 86-87 
elements of, 50 
equality, 51,56 
factoring of. 95 
Hessian. 314-315 
idempotent, 71.73,78 
identity. 55,69,70-71 
inverse, 75-78,99-103 
laws of operations on, 67-70 
lead vs. lag, 53,54 
Lconticf. 115, 116 
Markov transition. 79-80 
niultiplication of, 53-56.58. 
59-60.68-69 

nonsingular. See Nonsingularity 

null, 71 72 

rank of, 85-87,97-98 

scalar multiplication of, 52 

singular, 72, 75 

square, 50, 88,96 

subtraction of, 52, 67 

symmetric, 74 

transpose, 73-74 

vectors as, 50-51 

zero, 71-72

Maximum. See Extremum 
Maximum principle, 633-639 
Maximum-value functions, 428-435 
McShane, E. J., 253n
Mean-value theorem, 248
Metric space, 65
Minhas, B. S., 397n
Minimization of cost, 390-401
Minimization version of Kuhn-Tucker
conditions, 410
Minimum. See Extremum
Minor

bordered principal, 361-362
principal, 116-118, 304, 305,
306, 314

Mirror effect, 554, 556
Mirror images

in bordered Hessian, 363
in exponential and log functions, 
273-274 

in symmetric matrix, 74 
in time paths, 554 
Mishchenko, E. F., 633n

Mixed partial derivatives, 296 
Models and modeling 
closed, 119-120
of closed economy, 109-111
cobweb, 555-558
economic, 5-7
market. See Market models
mathematical, 5-7
national-income. See National- 
income models 
open, 113-116 
Modulus, 137, 512
Monetary policy, 534, 581
Monetary rule, 629
Money, marginal utility of, 375
Money illusion, 381
Motion, equation of, 631, 633-634
Multiconstraint cases, 354-355, 362-363
Multiple roots, 508
Multiplicative constant, 153
Multiplier 

interaction of, with accelerator, 
576-581 
Keynesian, 576

Lagrange. See Lagrange multiplier 
Multiproduct firm, 331-333,342-343 

N 

n-space, 60, 64, 65
n-variable, 307, 354-355
n-vector, 60

National-income models, 46-47, 108-109
comparative statics of, 210-213
dynamics of, 576-581
equilibrium in analysis of, 46-47
implicit-function theorem applied to, 

203-204,210-213 
Natural exponential function, 259 
Necessary condition, 82-84, 234-235,
237, 357-358, 424

Necessary-and-sufficient condition, 83,
84, 425

Negative area, 458
Negative definiteness, 306 
conditions for, 307, 311 
definite vs. indefinite, 302 


Negative growth, 266 
Negative semidefiniteness
conditions for, 311 
definite vs. semidefinite, 302 
Neighborhood, 133-134 
Neoclassical optimal growth model, 
649-651 
Nerlove, M., 558
Net exports, 213
Net investment, 466, 467
Neyman, J., 402n
Node, 618, 626, 627, 629
Nonalgebraic function, 23
Nonconstant solution, 478
Nonconvergent time path, 526
Nondenumerable set, 9
Nongoal equilibrium, 31
Nonhomogeneous equation, 476-478
Nonlinear programming, 356n 
constraints in, 404-408 
economic applications of, 418-424 
in relation to linear 
programming, 402 
sufficiency theorems in, 424-428 
Nonmathematical economics, vs. 

mathematical economics, 2-4 
Nonnegative solution, 116-118
Nonnegativity restriction, 402-403
Nonsingularity, 75

conditions for, 84-85, 96-97
test of, 88-94 

Nontrivial solution, 106, 600 
Normal good, 379 
Normalization 

of characteristic vector, 308 
of differential equation, 475n
Nth-derivative test, 253-254
Null matrix, 71-72
Null set, 10
Null vector, 61, 62-63


O 

Objective function, 221 

with more than two variables, 
313-317 

in optimal control theory, 632, 644 
Obst, N. P., 629

Official settlement, change of, 214n
One-to-one correspondence, 16, 60,
163, 165

Open-economy equilibrium, 214-216
Open input-output model, 113-116
Open interval, 133 


Operator symbol, 149 
Optimal control 

illustration of, 632-633 
nature of, 631-639 
Optimal control theory, 631-654 
alternative terminal conditions 
and, 639-644

autonomous problems in, 644-645
economic applications
of, 645-649

Pontryagin’s maximum principle
in, 633-639
Optimal growth model,
neoclassical, 649-651
Optimal input, elasticity of, 395
Optimal timing, 282-286
Optimization. See also Constrained 
extremum 

constrained, 432-433 
dynamic, 442, 631
maximization and minimization 
problems and, 221 
unconstrained, 428-432 
Optimization conditions, 7
Optimum, constrained vs. free, 347
Optimum output, 236 
Optimum value, 221 
Ordered n-tuple, 50 
Ordered pair, 15-16, 17
Ordered sets, 15 
Ordered triple, 16 
Ordinate, 36 
Orthant, 369
Orthogonal vectors, 309
Orthonormal vectors, 310
Oscillation, 552, 565
explosive, 566, 596
time path with, 556-558, 561-562, 
565-567 


P 

Parabola, 21
Parallelogram, 61-62 
Parameter, 6 

distribution, 397
efficiency, 388, 397
substitution, 397
Partial derivative
cross (mixed), 296
second-order, 295-297
Partial elasticity, 186, 187
Partial equilibrium, 31, 43
Partial total derivative, 192, 193
Particular integral

of first-order difference equation, 549
of first-order differential equation, 
477, 478 

of higher-order difference equation, 
569-570 

of higher-order differential equation, 
504-505 

intertemporal equilibrium and, 

481, 504 

of simultaneous difference 
equations, 597 
of simultaneous differential 
equations, 599 
of variable-term difference 
equation, 586-588
of variable-term differential
equation, 538-540
Payoff, 231
Perfect foresight, 537
Period, 516, 544
Period analysis, 544
Perpetual flow, present value of, 470
Phase, 516
Phase diagram 
analyzing, 653 
constructing, 652-653 
for difference equation, 562-567
for differential equation, 495-498,
500-501

for differential-equation system,
614-623

dynamic stability of equilibrium and,
495-498, 562-565, 619-620
Phase line, 495, 563, 565
Phase path, 617
Phase space, 615
Phase trajectory, 617
Phillips, A. W., 532n
Phillips relation, 532-533
expectations-augmented,
533-534, 581
long-run, 537, 585
Point concept of time, 264
Point elasticity, 288-289
Point of expansion, 242 
Polar coordinates, 520 
Polynomial equations 
higher-degree, 38-40
roots of, 38-40, 541
Polynomial functions, 20-21
continuity of, 142
degree of, 21
factoring of, 38-39
limit of, 141
Maclaurin series of, 242-243
Taylor series of, 244-245
Pontryagin, L. S., 633n
Pontryagin’s maximum principle, 
633-639 

Positive definiteness, 306 
conditions for, 307, 311
definite vs. indefinite, 302
Positive integers, 7
Positive semidefiniteness
conditions for, 311 
definite vs. semidefinite, 302 
Power-function rule, 149-152 
in finding total differential, 187 
of integration, 447 
Power series, 242
Present value, 266

of cash flow, 468-469
of perpetual flow, 470
Price, time path of, 529-532
Price ceiling, 566 
Price discrimination, 333-336 
Price expectations, 527-528, 558
Primal problem, 435
Primary input, 113 
Primitive function, 126 
Principal diagonal, 55 
Principal minor, 116-118 
bordered, 361-362 
Hawkins-Simon condition and, 304,
305, 306, 314
Product

Cartesian, 16 
direct, 16 
inner, 54 
marginal, 339 

marginal physical, 163, 198,
340, 499

marginal revenue, 163
scalar, 60, 66

Product limit theorem, 140
Product rule, 155-156, 187
Production functions
CES, 397-399

Cobb-Douglas. See Cobb-Douglas 
production function 
linearly homogeneous, 384-386 
strictly concave function applied 
to, 341 

strictly quasiconcave function
applied to, 392

Profit, maximization of, 235-238 
Profit function, 429-430 
Proper subset, 10 
Pythagoras’ theorem, 65, 512, 635


Q

Quadratic equation 

quadratic function vs., 35-36
roots of, 36, 38-40, 507-510
Quadratic forms, 301
constrained, 358-359
n-variable, 307

sign definiteness of characteristic-root test, 307-311
sign definiteness of determinantal
test, 302-304
three-variable, 305-307
Quadratic formula, 36-37
Quadratic function, 21, 22, 27, 35-36
Qualifying arc, 415, 416
Qualitative information, 157, 207
Quantitative information, 157, 207
Quantity discount, 131n
Quasiconcave function, 364-371. See
also Strictly quasiconcave
function

CES function as, 398
criteria for checking, 367-371
explicitly, 372-373, 378
in nonlinear programming, 425-426
Quasiconcave programming, 425
Quasiconvex function, 364-371
criteria for checking, 367-371
in nonlinear programming, 426
Quotient, difference, 125-126
Quotient limit theorem, 140
Quotient rule, 158-159, 187

R 

Radian, 514-515
Radius vector, 60 
Range, 18,19 
Rank, 85-87,97-98 
Rate of change, 125 
instantaneous, 126
proportional, 286n 
Rate of decay, 266 
Rate of growth 
finding, 286-288
instantaneous, 263-265, 286-288
Ration constraint, 418-420 
Rational function, 21-23 
continuity of, 142-143 
defined, 21 
Rational number, 8
Razor’s edge, 473-474 
Real line, 8 
Real number system, 7-8
Real roots, 507-509

distinct, 507-508, 570-571
repeated, 508-509, 571, 579, 583
Reciprocal, 56

Reciprocity conditions, 430-432
Rectangular hyperbola,
21-23, 561, 580
Reduced equation, 477 
Reduced-form solutions, 342-343 
Reduced linearization, 625 
Relation, 16 

Relative extremum, 222-223, 291, 347
determinantal test for, 317
Taylor series and, 250-253
Remainder

Lagrange form of, 248-249
symbol for, 245n

Repeated real roots, 508-509, 571,
579, 583

Replacement investment, 466
Resource, exhaustible, 647-649
Restraint, 348. See also Constraint 
Returns to scale 

constant. See Constant returns to 
scale (CRTS)

decreasing and increasing, 390, 401
Ridge lines, 339
Riemann integral, 457, 459
Risk, attitudes toward, 231-233
Roots 

characteristic. See 
Characteristic roots 
complex, 507-510, 512-513,
572-573, 579
dominant, 574

of polynomial equation, 38-40, 541
of quadratic equation, 36, 38-40,
507-510

real, 507-509, 570-571, 579, 583
Routh theorem, 542-543, 590
Row vector, 50, 53, 55
Roy’s identity, 437-438, 440

S 

Saddle point 

of dynamic system, 618 
of function, 295, 299, 302
stable and unstable branches of, 618
Samuelson, P. A., 45n, 542n, 576
Saving function, 185, 465
Scalar, 52, 59-60


Scalar multiplication, 52 
Scalar product, 60, 66 
Scale effect, 553, 554
Schur theorem, 598-599 
Second derivative, 227-233 
Second-derivative test, 233-234, 252
Second-order condition, 298-300,
313-316

derivative vs. differential form of,
292-293

necessary vs. sufficient, 234-235,
298, 299, 357-358
in relation to concavity and
convexity, 318-331
in relation to quasiconcavity and
quasiconvexity, 364-374
role of, in comparative statics, 345
Second-order total differential, 297-298,
301-302, 356-357
Semilog scale, 287n 

Series. See also Maclaurin series; Taylor 
series 

convergence of, 249,261 
infinite, 261, 517-519 
power, 242
Set(s), 8-14

complement of, 12

denumerable vs.
nondenumerable, 9
disjoint, 11
empty, 10
equality of, 10
finite vs. infinite, 9
intersection of, 11
laws of operations on, 12-14
null, 10

operations on, 11-14
ordered, 15

relationships between, 9-11
subset, 10
union of, 11
universal, 12
Set notation, 9
Shephard's lemma, 438-441
Side relation, 348. See also Constraint 
Sign definiteness 

characteristic-root test for, 307-311
determinantal test for, 302-304
positive and negative, 302
Silberberg, E., 428n
Simon, H. A., 116n
Simultaneous difference equations
applied, 603-609, 612-613
solving, 594-596


Simultaneous differential equations
applied, 605-607, 610-612, 614
solving, 599-601
Simultaneous-equation approach,
207-209

Sine function, 514
derivative of, 517
properties of, 515-517
table of values of, 515, 520
Singular matrix, 72, 75 
Sinusoidal function, 514 
Slope, 21 

Slutsky equation, 380
Smith, W. J., 240n
Social-loss function, 69
Solow, R. M., 45n, 397n, 474, 498
Solow growth model, 498-502, 652
Solution, 33-34

boundary vs. interior, 403
economically nonbinding, 420
of inequality, 138-139
mathematically binding, 420
nonconstant, 478
nonnegative, 116-118
nontrivial, 106, 600
outcomes for linear-equation system,
106-107

reduced-form, 342-343
trivial, 105

verification of, 478-479
Square matrix, 50, 88, 96
State equation, 633-634, 644-645
State variable, 631, 633
Static analysis

Leontief input-output models,
112-121

limitations of, 120-121
Statics, 31. See also Comparative statics
Stationary equilibrium, 482 
Stationary point, 224 
Stationary state, 501 
Stationary values, 224, 349-355 
Steady state, 501 
Step function, 131, 552
Stock concept, 264, 466
Streamlines, 617-618
Strictly concave functions, 318-320
applied to production functions, 341
criteria for checking, 320-324
defined, 230
strict vs. nonstrict, 318
Strictly convex functions

applied to indifference curves,
376-377
applied to isoquants, 341
criteria for checking, 320-324
defined, 230
strict vs. nonstrict, 318
Strictly quasiconcave function, 364-371
applied to production function, 392
applied to utility function, 377
Cobb-Douglas function as, 386
criteria for checking, 367-371
Strictly quasiconvex function, 364-371
criteria for checking, 367-371
strict vs. nonstrict, 364
Subset, 10

Subsidiary condition, 348. See also
Constraint

Substitutes, 41, 333, 337, 338
Substitution

elasticity of, 396, 397
marginal rate of, 375
technical, marginal rate of, 199,
391, 396n

Substitution effect, 380-381
Substitution parameter, 397
Substitution rule, 451-452
Suen, W., 428n

Sufficiency theorems, 424-428
Sufficient condition, 82-84, 234-235,
357-358, 424, 425
Sum-difference limit theorem, 140
Sum-difference rule, 152-155, 187
Σ notation, 56-58
Sum of squares, 60, 69
Summand, 57
Summation index, 57
Summation sign, 56-58
Supply, 31, 32, 35
lagged, 555 

with price expectations, 527 
Surface, 25 

concave or convex, 365 
hypersurface, 26 
utility, 377-378 
Symbols 

mathematical, 656-658 
operator, 149 
for remainder, 245n 
Symmetric matrix, 74 


T 

Takayama, A., 118n, 369n
Tangent function, 514 


Taylor series, 242 
convergent, 249 
of functions, 624 
of polynomial functions, 244-245 
relative extremum and, 250-253 
with remainder, 245 
Taylor's theorem, 245
Terminal conditions, alternative, 639-644
Terminal line
horizontal, 639
truncated horizontal, 640-643
truncated vertical, 639-640
Terminal point, fixed, 639
Test vector, 415, 416
Time horizon, infinite, 649-653
Time path 

convergent, 526 

with fluctuation, 525-527, 534-537 
mirror images in, 554 
nonconvergent (divergent), 526
nonoscillatory and
nonfluctuating, 579
with oscillation, 556-558,
561-562, 565-567
phase-diagram analysis of. See Phase
diagram

of price, 529-532
steady, 481, 583-584
with stepped fluctuation, 574-575,
579, 580, 584

types of, 496-498, 560, 564-566 
Timing, optimal, 282-286 
Total derivatives, 189-194 

applied to comparative statics, 
209-210 
partial, 192, 193

Total differential, 184-187, 352-353
of saving function, 185
second-order, 297-298, 301-302,
356-357

Total differentiation, 185, 190 

Trajectory, 617 

Transcendental function, 23

Transformation, 17-18, 593-594

Transitivity, 136

Transpose, 73-74

Transversality condition, 634, 637,
639-640

Triangular inequality, 65
Trigonometric function, 23, 514
Truncated horizontal terminal line,
640-643

Truncated vertical terminal line,
639-640


Tucker, A. W., 402n, 424
Twice continuously differentiable
functions, 154, 227


U 

Undetermined coefficients, method of, 
538-540,586-588,604,607 
Unemployment 

inflation and, 532-537, 581-585,
609-614

monetary policy and, 534
natural rate of, 537, 585
Uniform fluctuation, 526
Union set, 11
Unit circle, 523
Unit vector, 63
Universal set, 12
Utility maximization, 374-382
comparative statics of, 378-382
exhaustible resource and, 647-649
lifetime, 645-647 
Utilization, coefficient of, 473 


V 

Value(s)

absolute. See Absolute value 
critical, 224 
equilibrium, 32 
extreme, 221, 293-301
of function, 18, 19
of marginal product, 339
optimum, 221
present, 266, 468-469, 470
stationary, 224, 349-355
Vanishing determinant, 89, 95
Variable(s), 302
choice, 221 

continuous vs. discrete, 444 
control, 631 
costate, 633 
defined, 5 

dependent vs. independent, 18
elimination of, 33-34, 111, 116
endogenous vs. exogenous, 5-6
exponents as, 256
state, 631, 633
Vector(s)

addition of, 61-62
characteristic, 307, 308
column, 50, 53, 55
convex combination of, 328-330
geometric interpretation of, 60-62
as matrices, 50-51
null, 61, 62-63
orthogonal, 309
orthonormal, 310
radius, 60
row, 50, 53, 55
test, 415, 416
unit, 63

zero, 61, 62-63
Vector difference, 62 


Vector space, 63-65
Venn diagram, 12 
Vertical intercept, 21 
Vertical terminal line, truncated, 
639-640 
Vortex, 619, 627

W 

Walras, L., 43, 45n
Weighted average, 32? 

Weighted sum of squares, 69 
Whole numbers, 7 


Y 

Young's theorem, 296, 431, 432

Z

Zero matrix, 71-72
Zero-value (vanishing) determinant,
89, 95

Zero vector, 61, 62-63
Zeros of a function, 36 